THE CONSTRUCTION AND USES OF FRACTIONAL
FACTORIAL DESIGNS IN INDUSTRIAL RESEARCH 


O. L. Davies and W. A. Hay


Biological Laboratories, Imperial Chemical (Pharmaceuticals) Ltd., 
Blackley, Manchester 


1. Introduction 


An important application of statistical methods to industrial
research is in the design and analysis of experiments in connection
with the improvement of manufacturing processes. This applies to 
chemical and biological processes, formulation of pharmaceutical prepa- 
rations and, in fact, to most types of industrial research. For example, 
in chemical research it is frequently required to determine the effect 
of certain changes in reaction conditions or methods of manufacture 
on the yield and quality of chemical products. Typical reaction condi- 
tions which might be varied are temperature of reaction, time of reaction, 
rates and methods of agitation, concentration and amounts of reactants, 
different catalysts, etc. Similar considerations arise in chemicals manu- 
factured by fermentation processes, e.g. penicillin, streptomycin, in- 
dustrial alcohol, lactic acid, etc., for which we may wish to examine the 
effect of changes in the conditions of fermentation on the yield and 
quality of the products. The object of all such investigations is usually 
to improve the yield or the quality of the product, or to produce the 
product more economically. 

This type of research may involve the examination of many different 
factors and the problem is how best to design the experiments to 
examine the effects of these factors. In this type of investigation it is 
usually sufficient, in one experiment, to examine one change in each 
of the factors under investigation, for example, an increase or decrease 
in temperature of a fermentation process, an increase or decrease in 
the concentration and amounts of one or more of the various con- 
stituents of the medium, etc. The complete factorial design, where 
the factors are examined in all possible combinations, has long been
accepted as the best answer to this type of investigation provided the
number of factors is not large. The number of combinations is doubled 
for every additional factor and for a large number of factors the number 
of combinations involved often renders a complete factorial design 
prohibitive and uneconomical. In the more expensive and time-consuming
processes even a 2^5 factorial experiment would usually be regarded
as prohibitive. The main advantages of a factorial design are
that (a) it contains a large degree of ‘hidden replication’; for example in
a 2^5 experiment each effect is estimated with the precision of an average
of 16 comparisons of pairs of observations, and (b) it supplies estimates 
of all interactions. When the experimental error is large and/or when 
high order interactions are expected to be appreciable, then there is no 
satisfactory alternative to the complete factorial design. Frequently, 
however, experimental error is not large and the precision supplied by 
a complete factorial design is not required. Moreover, in many in- 
vestigations high order interactions are not of appreciable magnitude, 
and in such cases the complete factorial design is unnecessary. The 
answer to experimental situations of this kind is in the use of ‘Fractional 
Factorial Designs’ or ‘Fractional Replication’. Many papers have 
appeared during the past few years on this subject and some of these 
are listed in the references at the end of this paper. Unfortunately
for the industrial research worker, these papers, written largely from a
mathematical standpoint, have tended to mask the essential simplicity
of these very useful and valuable designs. The purpose of this paper 
is to draw attention to a simple method of constructing fractional 
factorial designs and to discuss their uses in industrial research. 


2. Construction of Fractional Factorial Designs. 


Even with complex industrial processes a design of eight observations 
can usually be carried out without any serious practical difficulty, and 
in general, such a design would be regarded as the minimum for satis- 
factory results. For three factors each at two levels, eight observations 
would constitute a complete factorial design and when only three 
factors require to be investigated, the complete factorial design would 
be carried out. Denoting the three factors in the usual way, by the 
capital letters A, B and C, the normal condition (or the lower level of 
each factor) by (1), and the change in condition of a factor (or the 
higher level) by the corresponding small letter, then the eight combina- 
tions of the complete factorial design are 


(1), a, b, c, ab, ac, bc, abc


consisting of (1), commonly called the ‘control’, a change in each factor
one at a time, two at a time and all three together. To calculate the
effect of A, we simply subtract the mean of the four observations not 
containing ‘a’ from the mean of the four containing ‘a’, and similarly 
for the main effects of B and C. In other words each of these effects,
apart from the factor 1/4, is represented by the sum of four given
observations minus the sum of the other four. Also, all interactions are
represented in a similar way and the effects can be conveniently set out
in the following tabular form:


TABLE 1.

           Observations or treatment combinations
Effect    (1)    a     b     c    ab    ac    bc   abc
A          -     +     -     -     +     +     -     +
B          -     -     +     -     +     -     +     +
C          -     -     -     +     -     +     +     +
AB         +     -     -     +     +     -     -     +
AC         +     -     +     -     -     +     -     +
BC         +     +     -     -     -     -     +     +
ABC        -     +     +     +     -     -     -     +


The signs in each row are those to be attached to each observation for 
the purpose of calculating the given effects. For each main effect the 
signs for the treatments containing the corresponding small letters are 
plus and the remainder minus. The signs for each interaction are 
the products of the corresponding signs for the constituent main effects, 
e.g. the signs for ABC are the products of the corresponding signs for A,
B and C. This table is the same, except in arrangement, as that given
by Yates (ref. 3, p. 11). If the minus sign is used to denote the lower 
level or the normal condition of the factor, and the plus sign the higher 
level or the change in the condition of the factor, the first three lines 
of the table can also be used as a convenient form of representing the 
design. Thus for the first observation A, B and C are minus which 
represent the normal condition of all the factors; for the second ob- 
servation, A is plus with B and C minus, therefore this denotes a change 
in A only, represented by ‘a’ in the former notation, and so on. The 
dual purpose of the table of signs is very convenient and, as will now 
be shown, may be used to facilitate the construction of fractional 
factorial designs. 
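The rule for building the table of signs, and for turning a row of signs into an estimate of an effect, can be sketched in a few lines of Python. This is only an illustration: the function names and the yields are hypothetical and do not come from the paper.

```python
from itertools import combinations


def sign_table(factors="ABC"):
    """Return (labels, table) for a complete two-level factorial.

    labels: treatment combinations in the order (1), a, b, c, ab, ...
    table:  dict mapping each effect, e.g. "AB", to its row of +1/-1 signs.
    """
    subsets = [c for r in range(len(factors) + 1)
               for c in combinations(factors, r)]
    labels = ["".join(s).lower() or "(1)" for s in subsets]
    table = {}
    for effect in subsets[1:]:          # skip the empty subset
        row = []
        for run in subsets:
            sign = 1
            for letter in effect:
                # +1 if this factor is at its higher level in the run
                sign *= 1 if letter in run else -1
            row.append(sign)
        table["".join(effect)] = row
    return labels, table


def estimate_effect(signs, observations):
    """Mean of the '+' observations minus mean of the '-' observations."""
    return sum(s * y for s, y in zip(signs, observations)) / (len(observations) / 2)


labels, table = sign_table("ABC")
# Hypothetical yields in which only the change in A matters (+4 units).
yields = [10, 14, 10, 10, 14, 14, 10, 14]
for effect, signs in table.items():
    print(effect, estimate_effect(signs, yields))
```

Each interaction row is obtained as the product of the signs of its constituent main effects, which is the rule stated in the text.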

Suppose it is required to examine four factors, A, B, C, D in eight 
observations, then this can be done by using the signs of one of the
interactions of Table 1, preferably ABC, to represent the effect of D.
In other words, the (+) signs of the row ABC are made to represent
the higher level of the fourth factor D and the (-) signs the lower
level of D. This operation may be referred to as equating the factor 
D to the three-factor interaction ABC. The design then becomes:— 


TABLE 2.

             (1)   ad    bd    cd    ab    ac    bc   abcd
A             -     +     -     -     +     +     -     +
B             -     -     +     -     +     -     +     +
C             -     -     -     +     -     +     +     +
D (= ABC)     -     +     +     +     -     -     -     +
AB (= CD)     +     -     -     +     +     -     -     +
AC (= BD)     +     -     +     -     -     +     -     +
BC (= AD)     +     +     -     -     -     -     +     +


This is one half of the complete factorial design for the four factors 
A, B, C, D. The method of constructing this is the same as that of 
confounding the interaction ABC between two blocks and identifying 
the blocks with the levels of the fourth factor D. 

It is seen that the product of the signs of A and B is the same as 
the product of the signs of C and D, therefore, in equating ABC to D, 
AB has also been equated to CD, similarly AC has been equated to 
BD, and BC to AD. Moreover, the product of any three of A, B, C 
and D is equal to the fourth, and therefore A has been equated to 
BCD, B to ACD and C to ABD. (These are the familiar aliases dis- 
cussed by Finney, 1, 2, and other authors). It is not possible from 
the above design to obtain an estimate of AB free of CD or vice versa, 
and similarly for any other such pair, and therefore, the above design 
should be used only when some of the interactions can be assumed
negligible. This is not necessarily a limitation because in a given
investigation the experimenter may know, from theoretical considera- 
tions or from previous experience with the same or similar processes, 
that certain interactions are not likely to be appreciable or are less 
likely to arise than others. For instance, in most chemical reactions, 
an interaction would be expected between the time of reaction and the 
temperature of reaction, particularly around those conditions which 
result in the highest yield or best quality of the product. Where the 
same comparison measures two effects it is usually possible to attribute
most if not all the comparison to one of the effects, the other being
rejected from prior knowledge. With four factors under investigation 
there would usually be no difficulty in arranging the conditions so that 
any two-factor interactions likely to exist would be confined to inter- 
actions between three of the factors, that is to say, one of the factors 
would be free of interactions with the other three. 

In Table 2, if A, B and C are the three factors amongst which some 
interactions may exist and D is the fourth factor having no interaction 
with the other three, then the design will estimate the four main effects 
and the interactions AB, AC and BC. 
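The alias pairs of the half factorial design can also be found mechanically, by building the eight runs with D = ABC and grouping together the effects whose columns of signs are identical. The following is a minimal sketch; the helper name `column` is hypothetical.

```python
from itertools import combinations, product


def column(runs, word):
    """Sign column of an effect: product of the factor levels named in `word`."""
    col = []
    for run in runs:
        sign = 1
        for letter in word:
            sign *= run[letter]
        col.append(sign)
    return tuple(col)


# Eight runs: full factorial in A, B, C with D given the signs of ABC.
runs = []
for a, b, c in product((-1, 1), repeat=3):
    runs.append({"A": a, "B": b, "C": c, "D": a * b * c})

# Group every main effect and interaction of A, B, C, D by its sign column.
groups = {}
for size in range(1, 5):
    for word in combinations("ABCD", size):
        groups.setdefault(column(runs, word), []).append("".join(word))

alias_sets = sorted(groups.values())
for members in alias_sets:
    print(" = ".join(members))
```

Effects printed on the same line share one comparison and cannot be separated by this design. ABCD appears alone because its column is all plus, so it is not estimable at all.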

If it is required to examine five factors, then at least one other two- 
factor interaction, e.g. AB, must be assumed zero and the fifth factor 
E would then be equated to AB in the usual way, that is, the pluses of 
AB would correspond to the higher level of E and the minuses to the
lower level. The resulting design is a quarter factorial design. If yet
another interaction, say AC, may be assumed zero, a sixth factor may
be included provided this additional factor does not interact with any 
of the other five, and in the limit, if all interactions are zero, seven 
factors may be examined by equating the additional factor to the re- 
maining interaction BC. 

Summarising the above, it is seen that the possibilities with eight 
observations are:— 


(1) Seven factors if all interactions are negligible 

(2) Six factors and one two-factor interaction if all other inter- 
actions are negligible 

(3) Five factors and the interaction of one factor with each of two 
others if all other interactions are negligible 

(4) Four factors and all two-factor interactions between any three 
of them if all other interactions are negligible 

(5) Three factors and all the interactions between them. 
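The same construction extends to any of the cases listed above. The sketch below is illustrative only: `build_design` and `is_orthogonal` are hypothetical helper names, and the assignment of E, F and G to AB, AC and BC follows the order used in this section. It builds the limiting case of seven factors in eight observations and checks that every column is balanced and orthogonal to every other.

```python
from itertools import product


def build_design(base, generators):
    """Full factorial in the `base` factors plus added factors.

    generators maps each added factor to the interaction of base factors
    whose signs it takes, e.g. {"D": "ABC"}.
    """
    runs = []
    for levels in product((-1, 1), repeat=len(base)):
        run = dict(zip(base, levels))
        added = {}
        for factor, word in generators.items():
            sign = 1
            for letter in word:
                sign *= run[letter]
            added[factor] = sign
        run.update(added)
        runs.append(run)
    return runs


def is_orthogonal(runs):
    """True if every column sums to zero and every pair of columns is orthogonal."""
    names = list(runs[0])
    for i, p in enumerate(names):
        if sum(r[p] for r in runs) != 0:
            return False
        for q in names[i + 1:]:
            if sum(r[p] * r[q] for r in runs) != 0:
                return False
    return True


saturated = build_design("ABC", {"D": "ABC", "E": "AB", "F": "AC", "G": "BC"})
for run in saturated:
    print("".join(k.lower() for k, v in run.items() if v == 1) or "(1)")
print(is_orthogonal(saturated))
```

Dropping entries from the `generators` dictionary gives the six-, five- and four-factor cases.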


3. Statistical Significance of the Effects. 


In fractional factorial designs involving as few as eight observations, 
most of the degrees of freedom are used up in estimating the main 
effects and certain two-factor interactions, and there are very few, if 
any, degrees of freedom left to obtain an estimate of the experimental 
error. This is not a serious limitation, however, because there is in 
existence a vast amount of experience on chemical, physical and
biological experimentation, both in the laboratory and on the plant, and,
for the vast majority of processes, fairly reliable information already 
exists on the magnitude of the experimental error which may be used
to assess the significance of the effects. Where no prior information
exists on the experimental error, then it would usually be necessary to 
carry out larger designs than of eight observations in order to obtain 
information on the experimental error. 


4. Aliases in a Fractional Factorial Design. 


The aliases in a design of eight observations representing half a 
factorial design have already been given. The same method could, if 
desired, be extended to a quarter and higher fractions of a factorial 
design but this tends to become rather tedious. If it is required to 
obtain a full list of the aliases then the best method is to derive the 
defining contrasts and use the method given by Finney (Refs. 1 and 2). 
These defining contrasts may be readily derived; for example, consider 
the design for five factors in eight observations obtained from Table 1 
by equating D to ABC and E to BC, that is D = ABC and E = BC. 
Multiplying the first by D and the second by E using the rules* of 
multiplication given by Finney (1, 2), we obtain 


I = ABCD and I = BCE. 


ABCD and BCE are thus two of the defining contrasts and the third 
is their product ADE. The aliases of any effect may then be obtained 
in the usual way from the relation 


I = ABCD = BCE = ADE 


by multiplying this relation by the given effect (Finney, loc. cit.). 
In the case of 7 factors in 8 observations, the identities are:— 


D=ABC, E=BC, F=AC, G=AB 


from which we see that ABCD, BCE, ACF and ABG are four of the 
defining contrasts and the remainder are obtained by multiplying these 
in all possible combinations generating 15 defining contrasts. The 
term ‘alias’ has a special meaning, and D = ABC really means that 
D and ABC cannot be separated and the corresponding comparison 
can be used to estimate D only when ABC is negligible. More will be 
stated on this point later. 
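The multiplication rule for effects (ordinary algebra with the square of every letter equal to 1) amounts to taking the symmetric difference of the sets of letters, so the defining contrasts and aliases can be generated mechanically. The following is a minimal sketch with hypothetical function names.

```python
from itertools import combinations


def multiply(*words):
    """Product of effects under the rule A^2 = B^2 = ... = 1."""
    letters = set()
    for word in words:
        letters ^= set(word)          # a letter appearing twice cancels
    return "".join(sorted(letters)) or "I"


def defining_contrasts(generators):
    """All products of the generating contrasts, taken 1, 2, ... at a time."""
    found = set()
    for r in range(1, len(generators) + 1):
        for combo in combinations(generators, r):
            found.add(multiply(*combo))
    return sorted(found, key=lambda w: (len(w), w))


def aliases(effect, contrasts):
    """Multiply the defining relation through by the given effect."""
    return sorted((multiply(effect, w) for w in contrasts),
                  key=lambda w: (len(w), w))


five = defining_contrasts(["ABCD", "BCE"])
print("I =", " = ".join(five))
print("A =", " = ".join(aliases("A", five)))

seven = defining_contrasts(["ABCD", "BCE", "ACF", "ABG"])
print(len(seven), "defining contrasts for 7 factors in 8 observations")
```

For the five-factor design this reproduces I = ABCD = BCE = ADE, and for the seven-factor design it returns the fifteen defining contrasts mentioned in the text.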


5. Designs of Sixteen Observations. 


The method of constructing fractional factorial designs of eight 
observations may be readily applied to designs of 16 observations. 


*The product of two or more effects is that given by the laws of ordinary algebra with the additional
condition that A^2 = B^2 = ... = 1. Thus (ABC)(CD) = ABD.

Sixteen observations are sufficient for a complete factorial design for
four factors each at two levels. Denote the factors by A, B, C and D 
then the complete factorial design is given by: 


(1), a, b, c, d, ab, ac, ad, bc, bd, cd, abc, abd, acd, bcd, abcd.


The table of signs similar to Table 1, but in this case for sixteen ob- 
servations, may be readily constructed, thus, for each main effect, 
combinations containing the corresponding small letters are plus and 
the remainder minuses and the signs for the interactions are obtained 
as before, by multiplying together the rows corresponding to the capital 
letters contained in the interactions. The table of signs is as follows:— 


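TABLE 3.

       (1)  a   b   c   d   ab  ac  ad  bc  bd  cd  abc abd acd bcd abcd
A       -   +   -   -   -   +   +   +   -   -   -   +   +   +   -   +
B       -   -   +   -   -   +   -   -   +   +   -   +   +   -   +   +
C       -   -   -   +   -   -   +   -   +   -   +   +   -   +   +   +
D       -   -   -   -   +   -   -   +   -   +   +   -   +   +   +   +
AB      +   -   -   +   +   +   -   -   -   -   +   +   +   -   -   +
AC      +   -   +   -   +   -   +   -   -   +   -   +   -   +   -   +
AD      +   -   +   +   -   -   -   +   +   -   -   -   +   +   -   +
BC      +   +   -   -   +   -   -   +   +   -   -   +   -   -   +   +
BD      +   +   -   +   -   -   +   -   -   +   -   -   +   -   +   +
CD      +   +   +   -   -   +   -   -   -   -   +   -   -   +   +   +
ABC     -   +   +   +   -   -   -   +   -   +   +   +   -   -   -   +
ABD     -   +   +   -   +   -   +   -   +   -   +   -   +   -   -   +
ACD     -   +   -   +   +   +   -   -   +   +   -   -   -   +   -   +
BCD     -   -   +   +   +   +   +   +   -   -   -   -   -   -   +   +
ABCD    +   -   -   -   -   +   +   +   +   +   +   -   -   -   -   +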


An additional factor E may be introduced by equating it to the
highest order interaction ABCD. This results in a half-factorial design
with the defining relation I = ABCDE obtained from E = ABCD by
multiplying both sides by E (Finney loc. cit.). From the way the
design has been constructed it is seen that all main effects, all two and 
three-factor interactions between A, B, C and D may be estimated 
provided all interactions of E with A, B, C and D are negligible. From 
the defining relation we see that if all three-factor interactions are 
negligible, the design will estimate all main effects and the two-factor 
interactions between all five factors. 

Additional factors may be introduced by equating them to the
three-factor interactions least likely to be appreciable. When all the three
and four-factor interactions between A, B, C and D can be assumed
negligible, as will usually be the case, there will be several interactions 
which can be used for additional factors. With one additional factor 
E, the obvious choice is to equate it to the four-factor interaction 
ABCD. When two additional factors E and F are introduced one of 
these could be equated to a four-factor interaction and the other to a 
three-factor interaction. A better design would however, result if the 
two additional factors are equated to three-factor interactions. This 
is seen from the following considerations. The first arrangement gives: 


E = ABCD, F = BCD


from which the following defining relation is obtained:


I = ABCDE = BCDF = AEF (1)


The second arrangement gives


E = BCD, F = ACD


with the defining relation:


I = BCDE = ACDF = ABEF (2)


From the last term in (1) it is seen that


A = EF, E = AF and F = AE (3)


whilst in (2) all main effects are clear of two-factor interactions. On 
the assumptions made that the interaction between E and F and of 
E and F with A, B, C and D are negligible, the first arrangement is 
quite satisfactory but the second arrangement gives a better separation 
of main effects and two-factor interactions. This illustrates the fol- 
lowing rule: 

If only one additional factor is to be introduced then equate this to 
the highest order interaction. If two or more factors are to be intro- 
duced then equate these first to the interactions of next to the highest 
order and when these have been used up then to the highest order 
interaction. 
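One way to compare the two arrangements numerically is to list the words of each defining relation and look at the length of the shortest one. A word of three letters means that some main effect has a two-factor interaction as an alias. The sketch below uses hypothetical function names. The shortest word length is what later literature calls the resolution of the design, a term that does not appear in this paper.

```python
from itertools import combinations


def multiply(*words):
    """Product of effects under the rule A^2 = B^2 = ... = 1."""
    letters = set()
    for word in words:
        letters ^= set(word)      # a repeated letter cancels
    return "".join(sorted(letters))


def defining_words(generators):
    """generators maps each added factor to the interaction it is equated to."""
    base = [multiply(factor, word) for factor, word in generators.items()]
    words = set()
    for r in range(1, len(base) + 1):
        for combo in combinations(base, r):
            words.add(multiply(*combo))
    return sorted(words, key=lambda w: (len(w), w))


first = defining_words({"E": "ABCD", "F": "BCD"})
second = defining_words({"E": "BCD", "F": "ACD"})
print(first, "shortest word:", min(map(len, first)))
print(second, "shortest word:", min(map(len, second)))
```

The first arrangement contains the three-letter word AEF, which gives the aliases in (3). In the second arrangement every word has four letters, so no main effect is aliased with a two-factor interaction.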

It is seen from Table 3 that as many as five additional factors may
be introduced while still preserving all the two-factor interactions
between A, B, C and D provided the additional factors do not interact
amongst themselves or with A, B, C and D.


6. Complete System of Fractional Factorial Designs. 


Any given fractional factorial design represents one of a system of 
such designs satisfying the same experimental requirements, and the
method described above supplies one of these, but the complete system
may readily be obtained as will now be illustrated. In a design of 16 
observations the fifth factor E was introduced by equating it to the 
four-factor interaction ABCD, that is E = ABCD. However, E could 
have been equated to -ABCD and the resulting design would be
similar except that all the signs in the row corresponding to ABCD 
would be reversed. Both fractional factorial designs together con- 
stitute all the combinations of the complete factorial design of the 
five factors A, B, C, D and E, and the contrast between the two
fractions represents the five-factor interaction ABCDE. When
introducing two additional factors, say E = ABCD and F = BCD, the sign
of either or both may be changed giving the following system of four 
fractional factorial designs 


(4.1) E = ABCD,  F = BCD
(4.2) E = -ABCD, F = BCD
(4.3) E = ABCD,  F = -BCD
(4.4) E = -ABCD, F = -BCD


These four fractions together constitute the complete factorial 
design, and the contrasts between them represent the interactions 
ABCDE, BCDF and their product AEF. When ‘n’ additional factors 
are introduced there will be 2^n suitable fractional factorial designs and
the one to be used should be chosen at random. This can easily be 
achieved by tossing a coin to decide what sign to use for each additional 
factor. 
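The system of fractions and the random choice among them can be illustrated as follows. This is a sketch under stated assumptions: the names `fraction`, `family` and `pick_fraction` are hypothetical, and Python's pseudo-random generator stands in for the coin.

```python
import random
from itertools import product


def fraction(sign_e, sign_f):
    """Sixteen runs with E = sign_e * ABCD and F = sign_f * BCD."""
    runs = set()
    for a, b, c, d in product((-1, 1), repeat=4):
        e = sign_e * a * b * c * d
        f = sign_f * b * c * d
        runs.add((a, b, c, d, e, f))
    return runs


# The four members of the system, (4.1) to (4.4).
family = {(se, sf): fraction(se, sf) for se in (1, -1) for sf in (1, -1)}


def pick_fraction(rng=random):
    """'Toss a coin' once for each additional factor."""
    signs = (rng.choice((1, -1)), rng.choice((1, -1)))
    return signs, family[signs]


signs, chosen = pick_fraction()
print("E sign, F sign:", signs, "runs:", len(chosen))

# Together the four fractions make up the complete factorial in six factors.
all_runs = set().union(*family.values())
print(len(all_runs))
```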

The reverse process of dividing a complete factorial design between 
blocks such that the contrasts between the blocks correspond to a given 
set of interactions, is the problem of confounding used extensively in 
agricultural experimentation (Ref. Yates (3)). Parallel situations arise 
in industrial experimentation, for example, it is not unusual in the 
manufacture of chemicals to encounter long and short periodic trends 
in the yield and quality of products, with the result that comparisons 
between observations carried out within a short interval of time are 
less subject to error than comparisons between observations made at 
longer intervals. The accuracy of a long term experiment may be 
increased if all important comparisons are confined to comparisons 
between observations carried out within relatively short periods of 
time. In the case of a complete factorial experiment, this is achieved 
by dividing the experiment into blocks, each block occupying a
relatively short period of time, in such a way that the comparisons between
the blocks correspond to unimportant interactions, and in this way the
important comparisons are kept within blocks. The interactions corre- 
sponding to the comparison between the blocks are said to be con- 
founded with blocks. 

The two fractional factorial designs of 16 observations mentioned 
above and derived by equating E to ABCD and -ABCD respectively,
represent the two blocks in the complete factorial design of the five
factors A, B, C, D and E confounding the five-factor interaction
ABCDE. Again, the system of four fractional factorial designs (4.1)
to (4.4) represent the four blocks of a 2^6 factorial design confounding
the interactions ABCDE, BCDF and AEF. 

Confounding is not confined to complete factorial designs and, if 
necessary, any of the spare degrees of freedom may be used to divide 
the fractional factorial design into smaller blocks. This is done in the
usual way (Ref. Yates, (3) p. 18) and the method will be illustrated in
the following example:


7. An Example. 


This example describes the design used in one of several investiga- 
tions, on the plant scale, of the effect of various factors on the yield of 
penicillin. There are three main stages in the production of penicillin, 
namely, the production of the inoculum, the fermentation stage and 
the chemical extraction of the penicillin. The inoculum produced at 
the first stage is used in the second stage for the production of the 
penicillin. This investigation was concerned only with the first two 
stages of the process. 

There were several factors which the biologists considered likely to 
have a beneficial effect on the efficiency of the process and the following 
five were chosen for this particular investigation: 


Stage 1 Preparation of Inoculum 
A Concentration of Corn Steep Liquor 
B Amount of Glucose 
C Quality of Glucose 
Stage 2 Fermentation 
D Concentration of Corn Steep Liquor. 


There was insufficient corn steep liquor from one delivery for the 
whole design and it was decided to use two deliveries, one half of the 
design to be made from each delivery; it was therefore necessary to 
add another factor


E Quality of Corn Steep Liquor


There were four similar fermenters available for this experimental 
work and so another factor is added 


F Fermenters 1, 2, 3 and 4. 


No large differences were expected between either the fermenters or 
the deliveries of corn steep liquor and therefore no interaction was 
likely between these two factors and the remainder. This part of the 
investigation was confined to one change in each of the factors, the 
change being taken in the direction considered, by the biologist, to be 
most likely to improve the yield. The changes in any case were rela- 
tively small so that no large interactions were likely to arise. As a 
safeguard it was decided to retain as many as possible of the inter- 
actions between A, B, C and D. 

A design of 16 batches was considered the very minimum size to be 
of any practical value owing to the relatively large variation known to 
arise in the biological process. This number is sufficient for a complete 
factorial design involving four factors. A, B, C and D are the more 
important factors and since these are the only factors between which 
interactions are likely to exist, we construct a basic factorial design 
with these factors. This would enable the following 15 effects to be 
estimated. 


Main effects: A, B, C, D
Two-factor interactions: AB, AC, AD, BC, BD, CD
Three-factor interactions: ABC, ABD, ACD, BCD
Four-factor interaction: ABCD


Factor E, that is the two qualities of corn steep liquor, is introduced 
by equating it to the interaction ABCD. There are four fermenters 
and to introduce these it is necessary to use up three degrees of freedom. 
This is really the standard problem of confounding the fractional 
factorial design of the five factors A to E into four blocks, each block 
corresponding to one fermenter. Two of the degrees of freedom may 
be chosen at will but the third will be the product of the two chosen 
ones. The best arrangement is two three-factor interactions and their 
product, which, unfortunately, must be a two-factor interaction. BC 
is the least likely of the interactions and therefore we chose ABD, 
ACD and BC to represent the comparisons between the fermenters. 
The method of dividing the observations into four blocks has been 
given by Yates (Ref. 3), which is: consider the signs of any two of these 
interactions, say, ABD and ACD (see corresponding rows in Table 4), 
and allocate those observations for which ABD = (+) and ACD = (+)
to one fermenter, ABD = (+) and ACD = (-) to another fermenter,
ABD = (-) and ACD = (+) to the third fermenter and ABD = (-),
ACD = (-) to the fourth.

As mentioned earlier, it is not unusual in the manufacture of chemi- 
cals to encounter trends with time, often of unknown causes, and in 
order to eliminate as much of this trend as possible it is desirable to 
introduce another system of confounding between four periods of time.
To do this we use up the remaining two three-factor interactions ABC,
BCD and their product AD. The comparison between the four times
least likely to be appreciable is (t1 - t2) - (t3 - t4) and so this particular
comparison is equated to the two-factor interaction AD. This is
achieved by considering the signs of ABC and BCD for the purpose of
allocating the observations into the four blocks, thus ABC = (+) and
BCD = (+) will fall in time block 1, ABC = (+) and BCD = (-) in time
block 2, etc. The resulting design is then:


TABLE 4

[Table of signs for the sixteen batches, with rows for Batch Number, Fermenter and Time; body not recoverable.]


This design may also be written in the following form:


TABLE 5

                        Fermenter
Time sequence     1       2       3       4
      1         abcde     b       c      ade
      2           a      cde     bde     abc
      3           d      ace     abe     bcd
      4          bce     abd     acd      e


This is the half factorial design ABCDE confounded between four
fermenters and again between four times. The batches in each time 
block are made at approximately the same time and so this design 
practically eliminates the effect of time trends. 
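The allocation of the sixteen batches to fermenters and time blocks follows directly from the signs of the confounded interactions, and can be sketched as below. The numbering of the blocks 1 to 4 in the order (+,+), (+,-), (-,+), (-,-) follows the description in the text, but its use as a lookup table here, like the function names, is an assumption made for illustration.

```python
from itertools import product


def sign(run, word):
    """Sign of an interaction for a run: product of the levels named in `word`."""
    s = 1
    for letter in word:
        s *= run[letter]
    return s


def label(run):
    return "".join(k.lower() for k, v in run.items() if v == 1) or "(1)"


# Block number for each pair of signs, in the order given in the text.
BLOCK = {(1, 1): 1, (1, -1): 2, (-1, 1): 3, (-1, -1): 4}


def allocate():
    """Return {(time block, fermenter): treatment combination}."""
    grid = {}
    for a, b, c, d in product((-1, 1), repeat=4):
        run = {"A": a, "B": b, "C": c, "D": d}
        fermenter = BLOCK[(sign(run, "ABD"), sign(run, "ACD"))]
        time = BLOCK[(sign(run, "ABC"), sign(run, "BCD"))]
        run["E"] = sign(run, "ABCD")      # E equated to ABCD
        grid[(time, fermenter)] = label(run)
    return grid


grid = allocate()
for time in (1, 2, 3, 4):
    print(time, [grid[(time, f)] for f in (1, 2, 3, 4)])
```

Every fermenter is used exactly once in each time block, so the sixteen batches fill a four by four array of time blocks and fermenters.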


8. The Use of Fractional Factorial Designs in Sequence. 


An important feature of most industrial experiments is that the 
observations are made in sequence, either singly or in sets of a few at 
a time, so that the observations of one set become available before the 
next need be started. Moreover, the time interval between the sets of 
observations is usually short, often a matter of days or even hours. 
In this way industrial experiments differ markedly from agricultural 
field experiments for which a whole year must elapse between successive 
trials. For this reason agricultural field experiments must be self- 
contained resulting in rigid complex designs. Once a field experiment 
in agriculture has been started it is not usually possible to change or 
modify the design but in most industrial work a high degree of flexibility 
exists because the situation may be reviewed after every observation 
or set of observations comes to hand. It is not necessary to adhere 
strictly to the design drawn up at the outset of an experiment but the 
design may be modified as the result of information gained from the 
earlier observations. 

Assume that a complete factorial design has been carried out and 
the change in one or more of the factors produces a large effect on the 
yield of a certain product. In such a case appreciable interactions are 
likely to arise between this factor and the others so that only one half 
of the design is of use for assessing the effects of the other factors. 
If the effect of the factor is a large reduction in yield of product, the 
loss in yield on the plant scale would represent a large financial loss. 
Since a factor with a large effect can usually be detected with relatively 
few observations, much of the experimenter’s time will have been 
needlessly wasted. If, instead of a full design, only a half-factorial 
design had been carried out, the loss would have been reduced to half, 
and for a quarter replicate, the loss would have been still further
reduced, and so on. If, on the other hand, no single factor produces a
large effect, nothing is lost by carrying out a fractional factorial design 
initially because, if this design does not give sufficient information, we 
can follow on with other fractional factorial designs of the same system 
until sufficient precision is obtained. 

The procedure to be followed when investigating a number of 
factors, even when a full factorial design may ultimately be required,
is to divide the factorial design into blocks confounding higher order
interactions and carry out these blocks in succession, examining each 
block after it is completed. If the first block gives all the information 
required then other blocks need not be carried out. Another possibility 
is that one or more of the factors may give a large effect in which case 
all further work would be confined to the more favourable levels of 
these factors. The experiment is then redesigned with fewer factors 
or, if required, with other factors to replace the ones dropped after the 
first set. Examples appear in a paper by Floyd (11). 


9. Separation of Aliases. 


One purpose of continuing the experiment with further fractional 
factorial designs in the same system may be to obtain greater precision 
in the estimates of the main effects and certain interactions, but the 
addition of a second fractional factorial design to the first also has the 
effect of separating the aliases. For instance, in a quarter factorial 
design, each degree of freedom represents a group of four effects, that 
is to say, each effect has three other effects as aliases. Now two quarter 
factorial designs together form a half factorial design for which each 
effect has one alias. The second fractional factorial design has thus 
produced a separation of the aliases and twice the number of effects 
can be independently estimated. There are usually several designs to 
choose from for the second set, and different pairs of fractional factorial 
designs in the same system produce different types of separation of the 
aliases. Consider for example a design of eight observations covering 
five factors A, B, C, D and E derived by equating D and E to the 
interactions ABC and BC respectively. There are four possible designs 
of this type as shown earlier (section 6) and these are as follows:— 


                               Defining relation
(5.1) D = ABC,  E = BC     I = ABCD = BCE = ADE
(5.2) D = -ABC, E = BC     I = -ABCD = BCE = -ADE
(5.3) D = ABC,  E = -BC    I = ABCD = -BCE = -ADE
(5.4) D = -ABC, E = -BC    I = -ABCD = -BCE = ADE


The aliases for each of the above fractional factorial designs can be 
derived in the usual way (Finney, loc. cit.); for example, the aliases for 
the main effect A in the above designs are, respectively,


(6.1) A = BCD = ABCE = DE
(6.2) A = -BCD = ABCE = -DE
(6.3) A = BCD = -ABCE = -DE
(6.4) A = -BCD = -ABCE = DE


It is necessary to know precisely what is meant by an 'alias'. This
term, now in general use, may be misleading, because A = BCD does
not mean that A is equal to BCD but that A and BCD cannot be
estimated separately; the corresponding degree of freedom in fact
measures (A + BCD), that is, the sum of the two effects. In the
above system of fractional factorial designs it is easy to see that the
groups of effects corresponding to the main effect A are respectively:


(7.1) A + BCD + ABCE + DE   (= x_1)
(7.2) A - BCD + ABCE - DE   (= x_2)
(7.3) A + BCD - ABCE - DE   (= x_3)
(7.4) A - BCD - ABCE + DE   (= x_4)


Denote the magnitude of these effects by x_1, x_2, x_3 and x_4
respectively. If we combine (7.1) and (7.2) we see that the sum gives
2(A + ABCE) and the difference gives 2(BCD + DE). We have,
therefore, separated A from the two-factor interaction DE, and A has
a four-factor interaction as an alias. The combination (7.1) and (7.3)
also separates the main effect A and the two-factor interaction DE,
and gives 2(A + BCD), 2(ABCE + DE). This, however, is not quite
so good because A now has a three-factor interaction as an alias.
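
The sums and differences quoted above can be checked mechanically. The following Python sketch stores the signs of A, BCD, ABCE and DE in each effect group and adds or subtracts them; the names `GROUPS` and `combine` are hypothetical and serve only this illustration.

```python
EFFECTS = ("A", "BCD", "ABCE", "DE")

# Signs of A, BCD, ABCE, DE in the effect groups (7.1) to (7.4).
GROUPS = {
    "7.1": (1, 1, 1, 1),
    "7.2": (1, -1, 1, -1),
    "7.3": (1, 1, -1, -1),
    "7.4": (1, -1, -1, 1),
}


def combine(first, second, sign):
    """Coefficients of first + sign*second, dropping effects that cancel."""
    totals = (a + sign * b for a, b in zip(GROUPS[first], GROUPS[second]))
    return {name: c for name, c in zip(EFFECTS, totals) if c != 0}


if __name__ == "__main__":
    print(combine("7.1", "7.2", +1))  # sum of (7.1) and (7.2)
    print(combine("7.1", "7.2", -1))  # difference of (7.1) and (7.2)
```

Any other pair of designs can be examined by changing the two labels passed to `combine`.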

When a further fractional factorial design is carried out in addition 
to the first two, a still further separation of the aliases occurs. For 
example, assuming that designs (5.1), (5.2) and (5.3) have been carried 
out, then if the same effect group as previously is considered, it is seen 
that the three effects A, DE and BCD have been separated, each having 
the four-factor interaction ABCE as its alias. Thus, from (7.1) and
(7.2) we obtain 2(A + ABCE), from (7.1) and (7.3), 2(DE + ABCE),
and from (7.2) and (7.3), 2(BCD - ABCE). It must be noted that
only two thirds of the observations are used for these estimates and
the conclusion is that if A, DE and BCD are to be separated, then this
can only be done with an efficiency of 66 2/3%. When the experimental
error is not large, this loss of efficiency is more than offset by the
information gained on the other interactions. If, however, the
three-factor interactions in addition to the four-factor interactions can
be assumed zero and only estimates of A and DE are required from
relations 7.1-7.4 of the 3/4 factorial design, then putting BCD = ABCE = 0
the following expressions are obtained:


(8.1) A + DE = x_1
(8.2) A - DE = x_2
(8.3) A - DE = x_3


Averaging the last two expressions gives:

        A - DE = (x_2 + x_3)/2
whence  A  = x_1/2 + x_2/4 + x_3/4
and     DE = x_1/2 - x_2/4 - x_3/4


Since each x is a difference between two means of four observations
each, the variance of these estimates is
(1/2)[(1/2)^2 + (1/4)^2 + (1/4)^2]σ^2 = 3σ^2/16. The smallest possible
variance for an estimate of one effect from 24 observations is
σ^2(1/12 + 1/12) = σ^2/6; therefore the efficiency of the estimates of
A and DE in the 3/4 factorial design, assuming that all three- and
four-factor interactions are zero, is 88 8/9%. When, of course,
all interactions other than those between A, B and C can be assumed
negligible, then all main effects and the two-factor interactions between
A, B and C can be estimated with 100% efficiency in the 3/4 factorial
design. When the fourth set is carried out, the factorial design is
complete and all effects can then be separated and estimated with 100%
efficiency. The above considerations may be readily applied to all the
other main effects B, C, D and E and the interactions AB and AC. For
example the effect groups containing interactions AB are:


(9.1) AB + CD + ACE + BDE
(9.2) AB - CD + ACE - BDE
(9.3) AB + CD - ACE - BDE
(9.4) AB - CD - ACE + BDE


The first two fractional factorial designs will separate the
two-factor interactions with 100% efficiency. The first three will
separate the two-factor interactions and one three-factor interaction,
the other being assumed zero. This can be done with 66 2/3% efficiency.
If all three-factor interactions are zero, the first three fractional
factorial designs will estimate AB and CD with 88 8/9% efficiency.
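
The two efficiencies quoted above (66 2/3% and 88 8/9%) can be reproduced with exact fractions. The sketch below is an illustration only: variances are expressed in units of σ^2, each x_i is taken as a difference between two means of four independent observations, and the names `VAR_X`, `VAR_BEST` and `variance` are hypothetical.

```python
from fractions import Fraction

# All variances are in units of sigma^2.
# Each x_i is a difference between two means of four observations.
VAR_X = Fraction(1, 4) + Fraction(1, 4)

# Best case for 24 observations: a difference between two means of 12.
VAR_BEST = Fraction(1, 12) + Fraction(1, 12)


def variance(weights):
    """Variance of a weighted sum of independent x_i."""
    return VAR_X * sum(w * w for w in weights)


# A + ABCE = (x_1 + x_2)/2 uses only two of the three sets.
var_two_sets = variance([Fraction(1, 2), Fraction(1, 2)])
# A = x_1/2 + x_2/4 + x_3/4 when BCD = ABCE = 0.
var_three_sets = variance([Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)])

efficiency_two_sets = VAR_BEST / var_two_sets
efficiency_three_sets = VAR_BEST / var_three_sets

if __name__ == "__main__":
    print(efficiency_two_sets, efficiency_three_sets)
```

The same calculation applies to AB and CD, whose effect groups have the same sign pattern.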

Similar considerations may be applied to any sequence of fractional
factorial designs. This method gives effectively a complete and simple
solution for all combinations of fractional factorial designs belonging to
the same or different systems of confounding for the 2^n class of factorial
designs (Refs. 6, 10).

Acknowledgments are due to Mr. G. E. P. Box with whom the
authors have had frequent discussion on the use of fractional factorial
designs.

SOME REMARKS ON ANIMAL POPULATION DYNAMICS


P. A. P. Moran 


Institute of Statistics, Oxford University 


THE PRESENT PAPER is not a survey of the vast field of investigation
into the dynamics of population change but an attempt to discuss 
a few specific problems in the hope of suggesting some new lines of 
investigations to those better qualified than the author to carry them 
out. Any investigation into population dynamics must necessarily 
involve mathematics in some way or other, and it is convenient and 
perhaps illuminating to begin by classifying the various ways in which 
mathematics is applied to biological problems of this kind. Moreover, 
such a classification will be found to hold equally well for the applica- 
tions of mathematics to economics. 

Without begging any philosophical questions, we may distinguish
between "a priori" and "a posteriori" methods. In a priori applications
of mathematics to biology, an attempt is made to describe the main
features of a biological system in terms of a mathematical model
involving a set of equations. From this model, we then try to make
deductions which can be verified by observation or experiment. Such 
deductions need not be numerical (if they are, they are usually very 
difficult of verification) but they may often be qualitative—e.g., ulti- 
mate extinction of a population, existence of oscillations, and so on. 
The difficulty and dangers of taking into account all the factors in a 
situation without making the mathematical model very complicated are 
obvious, yet the more complicated the model, the more difficult is it of 
verification. 

Such models may be further classified according to whether they do
or do not involve probability, and are then described as "probabilistic"
and "deterministic" respectively. Examples of the latter are the
equations of Volterra and others describing predator-prey relationships¹
and the matrix theory of a single population given by P. H. Leslie
(1945), (1948). An example of a "probabilistic" model is provided by
the evolutive stochastic processes studied by Feller (1939), D. G.
Kendall (1947), (1948a), and (1948b), and others.

¹This theory is described as deterministic because once the constants
and the initial values of the population densities are given, the
development of the situation is determinate. Probability theory may be
used to determine the functional relationships in the model, e.g. by the
consideration of "random encounters", etc.

Alternatively we may begin a study of animal populations from
actual numerical data: we must then use statistical methods. Here
again we may eschew probability and use only those statistical
methods which are of descriptive value, but this does not take us very
far. In order to estimate properties of our populations and to assess
the precision of these estimates, we must introduce probability ideas
and use the inferential techniques of statistical science.

Leaving aside the large field of sampling methods for the estimation 
of population density (e.g., the theory of capture—re-capture methods) 
which is obviously of great methodological importance to ecologists, 
an example of this type of problem arises from the statistical analysis 
of population cycles. One of the best sets of records showing cyclic 
changes are those on the trapping of lynx in Canada for which figures 
are given by Elton and Nicholson (1942). Perhaps the most suitable 
procedure for the analysis of such data would be to suppose these figures 
generated by some kind of stochastic process. The constants involved 
in such a model could then be estimated. Elton and Nicholson give 
the numbers of lynx caught in various regions of Canada over a long 
interval of time. These numbers show remarkably stable oscillations. 
The logarithms of the numbers have oscillations which are reasonably 
symmetrical about their mean, and an autoregressive scheme could be 
fitted. Thus if u_t is the logarithm of the numbers of lynx caught in
year t, in one of the regions considered, we might try to find out whether
they would be reasonably fitted by a scheme of the type


    u_t = a u_{t-1} + b u_{t-2} + e_t                         (1)


where a and b are constants and the e_t are random terms, with zero
means, which are serially uncorrelated. This is best done by fitting a
and b by the method of least squares. The regression of u_t on more
than the two previous terms could of course be considered. Equation
(1) may then be used to predict what will happen in the future. The 
standard error of such a prediction can be calculated but increases as 
the epoch of the predicted value moves into the future. An unsolved 
statistical problem of great difficulty is to devise a satisfactory test for 
correlation between two such series. The ordinary test of significance 
for a correlation coefficient cannot be applied (Moran, 1949). These 
types of problem are also of great interest in econometrics, and much 
theoretical and empirical work remains to be done on them. 
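
A minimal sketch of the least-squares fit of scheme (1), in Python, may make the procedure concrete. Several things here are assumptions made for illustration: the series is synthetic (simulated with a = 1.4 and b = -0.7, values chosen here and not taken from the lynx records), u_t is assumed to be measured from its mean so that no constant term is needed, and the names `simulate_ar2` and `fit_ar2` are hypothetical.

```python
import random


def simulate_ar2(a, b, n, seed=0, sigma=0.1):
    """Synthetic series u_t = a*u_{t-1} + b*u_{t-2} + e_t with Gaussian e_t."""
    rng = random.Random(seed)
    u = [0.0, 0.0]
    for _ in range(n):
        u.append(a * u[-1] + b * u[-2] + rng.gauss(0.0, sigma))
    return u[2:]


def fit_ar2(u):
    """Least-squares estimates of a and b from the 2x2 normal equations."""
    s11 = s12 = s22 = s1y = s2y = 0.0
    for t in range(2, len(u)):
        x1, x2, y = u[t - 1], u[t - 2], u[t]
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        s1y += x1 * y
        s2y += x2 * y
    det = s11 * s22 - s12 * s12
    a_hat = (s1y * s22 - s2y * s12) / det
    b_hat = (s2y * s11 - s1y * s12) / det
    return a_hat, b_hat


if __name__ == "__main__":
    series = simulate_ar2(1.4, -0.7, 2000, seed=1)
    print(fit_ar2(series))
```

With real records the fitted a and b would then be inserted in (1) to predict later values, the prediction error growing with the lead time as noted above.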
The dangers involved in constructing an a priori model of a
biological situation are apparent, but we must not go to the opposite
extreme and conclude that such methods are valueless. Not only may 
they suggest further lines of experimental investigation, but, if we give 
up hope of constructing valid quantitative theories, ecology will remain 
a collection of isolated facts. 

Consider now the problem of explaining why cyclical behaviour 
does occur in animal populations. All or nearly all the mathematical 
theories constructed to account for these observations depend on the 
interaction between two or more populations of animals and undoubt- 
edly this must play a major part in most observed cycles. In the 
present paper we shall consider what may happen in a single population
independently of changes in other populations. 

We suppose that the density of population is n, the number in 
some well defined region. When we neglect the probabilistic aspect of 
the problem there is no objection to regarding this as a continuous 
variable. Then the rate of increase of n is dn/dt, and, ignoring migra- 
tion (or supposing its net effect to be zero), this must equal the number 
of births minus the number of deaths. These in turn are Bn and Dn 
where B and D are defined as the birth and death rates. We then have: 


    d(log n)/dt = (1/n) dn/dt = B - D.                        (2)


If B and D do not depend on n, 1/n dn/dt will be a constant (apart 
from random fluctuations in the environment), and the population will 
either increase indefinitely or extinguish itself.* Thus either B or D 
or both must be density dependent for the population to continue to 
exist indefinitely in some state of equilibrium (using that word, for 
the moment, in the loosest sense). Hence, factors controlling the level 
of a population must be density dependent. This statement will 
continue to hold even if we make (2) into a probabilistic model by 
(say) letting B and D be dependent on such random phenomena as 
weather. We therefore assume B and D to be single-valued functions 
of n, and write them B(n) and D(n). We then have: 


    dn/dt = n{B(n) - D(n)}.                                   (3)


We now assert that if B(n) and D(n) depend on n and not on any other
characteristics of the population, oscillations cannot occur. For
suppose they did. Then at some times the population would be increasing
and at others decreasing and it would be possible to find two times at
which the densities were equal but in one of which it was increasing
and one decreasing. From (3) this is impossible, since the sign of dn/dt
is there fixed by n alone. To construct a theory which will explain
cyclic behaviour we must modify (3). Before we discuss how this is to
be done, we notice that the whole argument depends on the meaning of
the terms involved and not on any empirical facts. To know what does in
fact control population density and what is the mechanism of the
production of cyclic behaviour will require much empirical investigation.

*This ignores the possibility that B = D, which will not happen in
practice because of random fluctuations.
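
The argument that (3) cannot oscillate may be illustrated numerically. The Python sketch below integrates dn/dt = n{B(n) - D(n)} by Euler's method and checks that the path never reverses direction. The particular form B(n) - D(n) = r(1 - n/k) is an assumption made only for this illustration, as are the names `net_rate`, `integrate` and `is_monotone`.

```python
def net_rate(n, r=0.5, k=100.0):
    """Hypothetical B(n) - D(n): positive below k, negative above it."""
    return r * (1.0 - n / k)


def integrate(n0, steps=2000, dt=0.01):
    """Euler integration of dn/dt = n * (B(n) - D(n)) from density n0."""
    path = [n0]
    for _ in range(steps):
        n = path[-1]
        path.append(n + dt * n * net_rate(n))
    return path


def is_monotone(path):
    """True if the path never reverses direction."""
    diffs = [b - a for a, b in zip(path, path[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


if __name__ == "__main__":
    print(is_monotone(integrate(10.0)), is_monotone(integrate(250.0)))
```

Whatever density-dependent form is substituted for `net_rate`, the same check should hold as long as the rate depends on n alone and the time step is small.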

Now consider in what way it is possible to modify the model
described by (3). We may suppose that (B(n) - D(n)) is dependent
also on the density of some other population whose own rate of increase
depends on n. This could occur if the other population was parasitic
on the first, or in competition with it, and thus we get theories
associated with the names of Volterra, Nicholson and Bailey and others.
Alternatively, we may suppose that (B(n) - D(n)) depends not only
on n but also on the age distribution. Models may be constructed
mathematically, either by using integral equations (Lotka) or in terms
of matrix theory (Leslie 1945). In these theories, if the birth and death 
rates are age specific but do not depend on n or the past history of the 
population, the only oscillations which can occur are soon damped 
out. To construct a theory which would account for self-sustaining 
oscillations in a single population, therefore, we must suppose that 
(B(n) — D(n)) is also dependent on the past history of the population. 
We may call this dependence “‘hysteresis’’. 

As an example, consider a highly simplified model of a vole popu- 
lation. The role of hysteresis in this model will be considered later. 
The vole has a life of approximately one year. At the beginning of 
spring the population density is low. The numbers increase during the 
summer and decrease during the winter. If the number in a given area 
is measured at the same time every year, successive values generally
increase for several years until there is a decrease, known as the
"crash", which usually occurs during one year only. The cycle then
repeats, and has a period of the order of four years. Suppose that the
densities at the beginning of spring are n_1, n_2, ..., these being the
numbers in a given area. Suppose we assume a crudely deterministic model
such as will result in the density n_{i+1} being some mathematical
function of n_i alone (and thus not of the previous n's or of the
previous history of the population up to the spring of year i). We write
this function n_{i+1} = f(n_i). It is natural to assume that as n_i
increases from zero, n_{i+1} will also. But if f(n_i) > n_i for all n_i,
the population will continually increase and if f(n_i) < n_i for all
n_i, continually decrease. There must therefore be at least one point
on the curve at which n_i = f(n_i). We shall see later that, for
self-sustaining oscillations to be set up, f(x) must have its slope
negative at this point and an increase of n_i must result in a decrease
of n_{i+1}. We therefore assume that the


FIGURE 1 


curve is of the form shown in Fig. 1 and that, as x increases, f(x)
increases up to a maximum and then decreases continually without
becoming zero. Let x_m be the value at which f(x) has a maximum.
With the possible exception of the initial value of the sequence, all
the n_i ≤ f(x_m) and so f(x) need not be considered for values of x
greater than f(x_m). Starting from any initial value n_1 we obtain
successively

    n_2 = f(n_1),
    n_3 = f(n_2), ...

The behaviour, oscillatory or otherwise, of the series n_1, n_2, ... will
be determined by n_1 and the function f(x).

Let n_s be the value of n (assumed unique) for which n = f(n).
This is represented by the point where the curve y = f(x) crosses the
line y = x. If the initial value equals n_s, so will all subsequent values
(ignoring the effects of random fluctuations and external effects). We
may call such a process "stationary" because, although the population
density rises and falls during the year, it does not change from year
to year. As stated above, observed vole populations do not behave
like this at all but show a more or less steady rise during several years
followed by a sudden decrease known as the "crash". With rare
exceptions, this crash occurs in one year only, and we do not have


    n_{i+2} < n_{i+1} < n_i.


This phenomenon is also reproduced in the model, for

    n_2 = f(n_1) < n_1

implies n_1 > n_s and n_2 < n_s, and therefore n_3 = f(n_2) > n_2.

Next consider the stability of the stationary process where the
population density is constant and equal to n_s. Suppose that initially
there is a small disturbance so that n_1 = n_s + δn_1, where δn_1 is
small. Then


    n_2 = f(n_s + δn_1)
        = n_s + δn_1 f'(n_s), approximately.

For stability we must have


    |δn_2| < |δn_1|,


where δn_2 = n_2 - n_s = δn_1 f'(n_s),
and we must therefore have |f'(n_s)| < 1.
If this is true, there will be a region around n_s such that if n_1 is in
this region subsequent n's will converge to n_s. We then say that such a
process is "stable".
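
The stability condition can be tried out on a concrete curve. The Python sketch below is an illustration under stated assumptions: the humped curve f(x) = x exp(r(1 - x/k)) is a hypothetical choice (it rises to a maximum and then declines without reaching zero, but it is not taken from the paper), its stationary value is x = k, and the names `ricker`, `curve`, `slope` and `iterate` are introduced only here.

```python
import math


def ricker(x, r, k=1.0):
    """Hypothetical humped curve: rises to a maximum, then declines towards zero."""
    return x * math.exp(r * (1.0 - x / k))


def curve(r):
    """The map f for a given value of r (with k = 1, so the stationary value is 1)."""
    return lambda x: ricker(x, r)


def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def iterate(f, n1, steps):
    """Successive densities n_1, n_2 = f(n_1), n_3 = f(n_2), ..."""
    values = [n1]
    for _ in range(steps):
        values.append(f(values[-1]))
    return values


if __name__ == "__main__":
    for r in (1.5, 2.3):
        f = curve(r)
        print(r, slope(f, 1.0), iterate(f, 1.05, 200)[-3:])
```

For this curve the slope at the stationary value is 1 - r, so a small disturbance dies away when r = 1.5 and grows when r = 2.3.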

Though f'(n_s) < 1 (because y = f(x) crosses the line y = x from
above), it is not impossible that f'(n_s) < -1. No stable population,
stationary from year to year, is then possible, and the population must
oscillate. No "optimum level" of population density can then be
defined. The oscillations are bounded above and below, because f(x)
has a maximum f(x_m) and, whatever n_1, all subsequent values (with the
possible exception of a few immediately following n_1) must be between
f(f(x_m)) and f(x_m). The model thus looks like giving the kind of
oscillations which are observed.

We may proceed further and consider oscillations which are definitely
cyclic, as in a process for which there is a cycle of p values


    n_1, n_2, ..., n_p,   where n_1 < n_2 < ... < n_p > n_1.


This process involves only a single crash (from n_p to n_1) in each cycle.
If we write f_p(x) = f(... f(f(x)) ...) where the f occurs p times,


    n_1 = f_p(n_1).


The cycle can be shown to be stable if


    |d f_p(x)/dx| < 1   at x = n_1,


which is equivalent to

    |f'(n_1) f'(n_2) ... f'(n_p)| < 1.


This condition for stability in a process of this kind occurring in
economic theory has been given by Leontief (Samuelson (1947)
p. 391). That stable cycles can exist with suitably chosen f(x) can be
easily shown mathematically. I have carried out numerical and
mathematical investigations on a number of such functions which
verify this conclusion, but, in the present state of knowledge of what
actually occurs, a description of them would scarcely be of value to
the ecologist. In principle the shape of f(x) might be estimated from
experimental observations, but considerable difficulties will arise. One
of these is that both the observations (x, y), where y is a presumed
value of f(x), will probably be subject to experimental error. This will
have the consequence that it will be difficult to establish the slope of
f(x) with any certainty.
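
The cycle condition, that the product of the slopes round the cycle be less than one in absolute value, can be checked on an example. The Python sketch below again uses the hypothetical curve f(x) = x exp(r(1 - x)), which is an assumption for illustration and not one of the functions investigated by the author; with r = 2.3 the iterates settle on a cycle of two values. The names `ricker`, `ricker_slope`, `settle`, `cycle_from` and `cycle_multiplier` are likewise hypothetical.

```python
import math


def ricker(x, r):
    """Hypothetical humped curve f(x) = x * exp(r * (1 - x))."""
    return x * math.exp(r * (1.0 - x))


def ricker_slope(x, r):
    """Analytic derivative f'(x) of the curve above."""
    return (1.0 - r * x) * math.exp(r * (1.0 - x))


def settle(r, n1=0.5, burn_in=2000):
    """Iterate long enough for transients to die away; return the last value."""
    n = n1
    for _ in range(burn_in):
        n = ricker(n, r)
    return n


def cycle_from(n, r, p):
    """The p successive values n, f(n), f(f(n)), ..."""
    cycle = [n]
    for _ in range(p - 1):
        cycle.append(ricker(cycle[-1], r))
    return cycle


def cycle_multiplier(cycle, r):
    """Product f'(n_1) f'(n_2) ... f'(n_p) round the cycle."""
    product = 1.0
    for n in cycle:
        product *= ricker_slope(n, r)
    return product


if __name__ == "__main__":
    r = 2.3
    cycle = cycle_from(settle(r), r, 2)
    print(cycle, cycle_multiplier(cycle, r))
```

A product smaller than one in absolute value is what allows the iterates to settle on the cycle in the first place; for larger r the same check can be repeated with a longer assumed period p.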

We have not proved that there do not exist stable cyclic solutions
in which two or more distinct crashes occur: thus we might imagine a
cycle of four values containing two crashes, but in which n_1, n_2, n_3
and n_4 are all unequal. Criteria for the existence of
such cycles in terms of f(x) are not known.

This theory depends essentially on the assumption that a high 
density at the beginning of spring has such an adverse effect on the 
death and (or) birth rates that an unduly low density the following 
year must result. Thus we can choose n_1, n_1' such that n_1 < n_1' but
n_2 = f(n_1) > n_2' = f(n_1'). If we follow the course of the two
densities during the year, there must be a point where they are equal
but their rates of change unequal. This implies that the birth and (or)
death rates of the two populations, at this point, cannot be solely
dependent on the density and that there must be some kind of historical
dependence or hysteresis. That such a phenomenon is necessary to
explain oscillations arising from a purely intra-specific origin has already 
been pointed out above. It has been discussed (mainly with reference 
to interspecific relationships) by Solomon (1949) under the name of 
“Suppression”. Some empirical evidence that it can occur in laboratory 
insect populations can be deduced from the experiments of Pearl, 
Miner and Parker (1927). 

Some inadequacies in the above theory must be pointed out. We 
have assumed that future changes in population density are uniquely 
determined by the density at the beginning of spring, and thus that 
there is no hysteresis effect across this point. This is probably untrue 
as, for example, the density during the previous winter may have some 
long term effect on the reproductive capacity of the survivors. If this 
is important, we must either find a function f(x) in terms of some other 
point in the cycle across which the hysteresis is assumed to be negligible 
(for example the end of summer) or construct a more complicated 
model. 

Another defect in the theory is the neglect of the age distribution. 
If the death and birth rates are not only age specific but also such that 
their age-specific dependence is itself dependent on the population 
density, it would not be difficult to construct a mathematical theory 
which would result in self-generated oscillations. In such a case, the 
“hysteresis” effect arises in part from the changes in the age distribution. 
In practice both these assumptions are probably true, but, in default 
of any method of estimating their importance, it does not seem at 
present worth while setting up a mathematical model which would be 
much more complicated than the above. In the meantime it is to be 
hoped that further experimental work will be done and that the above 
mathematical discussion may direct attention to some of the more
important factors. A mathematical scheme of the kind considered in 
the present paper may also be applicable when we consider successive 
generations instead of successive epochs of time. Some interesting 
experimental results along these lines have been described by Nicholson 
(1950). 

I am much indebted for very helpful suggestions and criticisms from 
members of the Oxford University Bureau of Animal Population, who 
are not to be held responsible, however, for any opinions expressed 
above. 

Corrigenda: "Fitting a straight line when both variables are subject to
error” by M.S. Bartlett, Biometrics, 5, 207-212, 1949. 


(i) At the top of p. 210, the last factor in the formula should read 


2 2 4 
x} instead of 2 3 


(ii) In the formula for ¢ on p. 211, the factor 


2 4 2 4 \* 
+ 3 should read + x} 
The author is indebted to J. M. Cameron for pointing out these errors; 
the numerical example is, however, based on the correct formulae. 

COMPOUND SYMMETRY TESTS IN THE MULTIVARIATE
ANALYSIS OF MEDICAL EXPERIMENTS*


D. F. Votaw, Jr.
Yale University, New Haven, Conn.


A. W. Kimball and J. A. Rafferty


USAF School of Aviation Medicine, Randolph Field, Texas


Summary. This paper presents tests of several statistical hypotheses
which assert that experimental quantities are "'stable' with respect to
time". More generally, the hypotheses assert that experimental
quantities satisfy certain "symmetry" conditions (see sections 1, 2).

Two of the statistical tests can be used to test hypotheses about 
“row effects” in the Analysis of Variance. However, these hypotheses— 
unlike the conventional Analysis of Variance hypotheses—do not require 
that the experimental quantities be uncorrelated nor that there be homo- 
geneity of variances (see section 3). 

An experiment in connection with investigations of cold injury is 
described. Tests of compound symmetry are employed in the analysis 
of the results. Readers who prefer non-mathematical discussions will 
find this illustrative example most informative. (See section 4). 

In the Appendix the hypotheses are stated in full generality and 
expressions for the corresponding sample criteria are given (see sections 
A.1, A.2). For various special cases, the means, variances and exact 
cumulative distributions of the criteria are given together with
χ²-distributions that approximate the exact distributions (see section A.3).


*This paper is an expansion of a 15-minute paper (see Abstract No. 69 in
[6, p. 81]) presented December 27, 1948, at a meeting in Cleveland, Ohio,
of the American Statistical Association.


1. Introduction. Occasionally the medical experimenter needs to test
statistically whether the experimental quantities are "'stable' with
respect to time". The notion of such "stability"* requires
interpretation; and in this paper two interpretations will be considered.
In each it is presupposed that the observations have a normal
multivariate distribution; thus a condition of stability can be expressed
merely in terms of the true means, standard deviations or coefficients of
correlation associated with the multivariate distribution.

Suppose that for each member of a sample of experimental animals
% CO2 in blood is measured at each of two times, T_1, T_2, and hematocrit
is measured at each of three times, T_1', T_2', T_3'. Let X_1, X_2 be the
two CO2 measurements and X_3, X_4, X_5 be the three hematocrit
measurements; also assume that (X_1, X_2, X_3, X_4, X_5) has a 5-variate
normal distribution. Let the true mean and standard deviation of
X_i be m_i and σ_i, respectively, and the true coefficient of correlation
between X_i and X_j be ρ_ij. The assertion that there is "stability with
respect to time" could be interpreted as follows:**


    m_1 = m_2;  σ_1 = σ_2;  m_3 = m_4 = m_5;  σ_3 = σ_4 = σ_5;
                                                                  (1.1)
    ρ_13 = ρ_14 = ρ_15 = ρ_23 = ρ_24 = ρ_25;  and  ρ_34 = ρ_35 = ρ_45.


What is stated in (1.1) can be expressed as follows: for the two % CO2
measurements the true means are equal and the true standard
deviations are equal; for the three hematocrit measurements the true
means are equal, the true standard deviations are equal, and the
intercorrelations (between distinct measurements) are equal; and the
intercorrelations between % CO2 measurements and hematocrit
measurements are equal.

When the two sets of times of measurement are the same, there is
another relevant interpretation which takes into account whether the
measurements are simultaneous. For example, suppose that % CO2 is
measured at two times, T_1, T_2 (as before), and hematocrit is measured
at only two times, T_1' and T_2', and T_1' = T_1 and T_2' = T_2; then
"stability" could be taken to mean:†


    m_1 = m_2,  m_3 = m_4,  σ_1 = σ_2,  σ_3 = σ_4,
                                                        (1.2)
    ρ_13 = ρ_24  and  ρ_14 = ρ_23.


*The hypotheses to be considered will first be discussed in terms of “‘stability’”. An alternative 
illustration will then be given (see (1.3)). 

**See (2.1) for a restatement of (1.1). 

†See (2.2) for a restatement of (1.2).

If simultaneity is taken into account and the sets of times of measurement are not identical but 
have at least one time common to two or more sets, then the situation is obviously a ‘‘mixture” of (1.1) 
and (1.2); however, such ‘‘mixed”’ cases will not be considered in this paper. 
What is stated in (1.2) can be expressed as follows: for the two % CO2
measurements the true means are equal and the true standard deviations
are equal; for the two hematocrit measurements the true means are
equal and the true standard deviations are equal; the intercorrelations
between % CO2 and hematocrit measured simultaneously are equal;
and the intercorrelations between % CO2 and hematocrit measured at
distinct times are equal.

An illustration of the hypotheses will now be given in a situation in 
which the experimental quantities do not explicitly involve time. 
Suppose the anterior and posterior muscles of both hind legs of rabbits 
are to be weighed (see section 4). Let X_1, X_2, X_3 and X_4 be the
anterior left, anterior right, posterior left and posterior right hind leg
muscle weights, respectively. It is assumed that (X_1, X_2, X_3, X_4)
has a 4-variate normal distribution. In comparing the right and left
legs, the following hypotheses might be of interest.


    m_1 = m_2;  m_3 = m_4;  σ_1 = σ_2;  σ_3 = σ_4;  and
                                                          (1.3)
    ρ_13 = ρ_14 = ρ_23 = ρ_24.


(1.3) asserts that: (a) the anterior muscle weights have the same means 
and the same variances; (b) the posterior muscle weights have the 
same means and the same variances; (c) the intercorrelations between 
anterior and posterior muscle weights are all equal. The experimenter 
might feel, however, that there are too many conditions in (c) and 
prefer the following hypothesis: 


    m_1 = m_2;  m_3 = m_4;  σ_1 = σ_2;  σ_3 = σ_4;
                                                          (1.4)
    ρ_13 = ρ_24  and  ρ_14 = ρ_23.


(1.4) asserts (a) and (b) above and the following two conditions: (1) the 
correlation of the two left-leg muscle weights equals the correlation of 
the two right-leg muscle weights; (2) the (left-leg anterior)-(right-leg 
posterior) correlation equals the (right-leg anterior)-(left-leg posterior) 
correlation. 

The purposes of this paper are: (i) to present, in a simple form, 
sample criteria for accepting or rejecting hypotheses of the sort in 
(1.1), (1.2), (1.3), and (1.4), and for certain generalizations of such 
hypotheses (see A.1 in the Appendix); (ii) to give means, variances
and distributions of the sample criteria (when the corresponding
hypotheses are true) for various special cases; and (iii) to indicate the
relation between two of the hypotheses, denoted by H̄_1(m) and H_1(m),
and certain hypotheses in the Analysis of Variance (see section 3).
Neither time nor a biological system need be involved in a situation
to which the hypotheses are relevant; e.g., in Psychometrics a hy- 
pothesis of the sort stated in (1.1) can be used to test statistically 
whether three different forms of a psychological examination are inter- 
changeable and have equal validities regarding both of two criterion 
measures (see [2, pp. 447-448]). 

2. Restatements of Hypotheses. Inspection of (1.1) and (1.2) shows 
that in either interpretation of “stability” not only the number of 
quantities to be measured but also their grouping is important. Similar 
remarks apply to (1.3) and (1.4). Five quantities, X_1, ..., X_5,
are involved in (1.1) and they fall into two mutually exclusive groups:
(X_1, X_2), (X_3, X_4, X_5). In (1.2), (1.3), and (1.4) four quantities,
X_1, X_2, X_3, and X_4, are involved, falling into two mutually exclusive
groups: (X_1, X_2), (X_3, X_4). Convenient notations for these two
groupings are: (2,3) and (2,2), respectively. (See A.1 in the Appendix
for a full discussion of grouping). For the case in which there is only 
one group see [1]. 

Let H̄_1(mvc) be the hypothesis stated in (1.1), (1.3), or any of
the generalizations of (1.1) or (1.3) (see A.1 in the Appendix). Let
λ_ij = σ_i σ_j ρ_ij; ||λ_ij|| is the variance-covariance matrix of the
chance quantities; for example in (2.1), below, λ_23 = σ_2 σ_3 ρ_23,
λ_33 = σ_3². For the grouping (2,3) H̄_1(mvc) may be stated as follows:


    m_1 = m_2,  m_3 = m_4 = m_5  and

                  | B  C : D  D  D |
                  | C  B : D  D  D |
    ||λ_ij||  =   | .............. |                      (2.1)
                  | D  D : E  F  F |
                  | D  D : F  E  F |
                  | D  D : F  F  E |


(2.1) is simply a restatement of (1.1). The broken lines in (2.1) indicate
the grouping of the rows and columns in the variance-covariance matrix.
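
The pattern in (2.1) can be stated as a simple check on a variance-covariance matrix: within each diagonal block the variances are equal and the covariances are equal, and each off-diagonal block is constant. The Python sketch below is an illustration only; the function names and the numerical values substituted for B, C, D, E and F are hypothetical.

```python
def block_ranges(group_sizes):
    """Index ranges of the groups, e.g. (2, 3) -> range(0, 2), range(2, 5)."""
    ranges, start = [], 0
    for size in group_sizes:
        ranges.append(range(start, start + size))
        start += size
    return ranges


def all_close(values, tol=1e-12):
    """True if every value equals the first to within tol (or the list is empty)."""
    values = list(values)
    return all(abs(v - values[0]) <= tol for v in values)


def has_grouped_pattern(cov, group_sizes):
    """Check the form (2.1) for a covariance matrix given as a list of rows."""
    groups = block_ranges(group_sizes)
    for gi, rows in enumerate(groups):
        for gj, cols in enumerate(groups):
            if gi == gj:
                diagonal = [cov[i][i] for i in rows]
                off = [cov[i][j] for i in rows for j in cols if i != j]
                if not (all_close(diagonal) and all_close(off)):
                    return False
            elif not all_close(cov[i][j] for i in rows for j in cols):
                return False
    return True


# Illustrative values only: B, C, D, E, F as in (2.1).
B, C, D, E, F = 4.0, 1.0, 0.5, 9.0, 2.0
EXAMPLE = [
    [B, C, D, D, D],
    [C, B, D, D, D],
    [D, D, E, F, F],
    [D, D, F, E, F],
    [D, D, F, F, E],
]

if __name__ == "__main__":
    print(has_grouped_pattern(EXAMPLE, (2, 3)))
```

This checks an exact pattern in a given matrix; it is not a substitute for the sample criteria, which must allow for sampling variation.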

In certain cases we might wish to ignore the conditions on the
means and assert merely that ||λ_ij|| has the form given in (2.1)*.
This less restrictive hypothesis is denoted by H̄_1(vc). In some
situations we might take H̄_1(vc) as given and merely assert that the
means satisfy the conditions given in (2.1); this third hypothesis is
denoted by H̄_1(m).

*or in a generalization of (2.1) (see A.1).

Let H_1(mvc) represent the hypothesis in (1.2), (1.4), or any of the
generalizations of (1.2) or (1.4) (see A.1). H_1(mvc) may be stated
as follows:


    m_1 = m_2,  m_3 = m_4  and

                  | λ_11  λ_12 : λ_13  λ_14 |
                  | λ_12  λ_11 : λ_14  λ_13 |
    ||λ_ij||  =   | ....................... |             (2.2)
                  | λ_13  λ_14 : λ_33  λ_34 |
                  | λ_14  λ_13 : λ_34  λ_33 |


We shall also consider hypotheses H_1(vc) and H_1(m), which have to
H_1(mvc) the same relations, respectively, that H̄_1(vc) and H̄_1(m) have
to H̄_1(mvc).


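For the grouping (2,2) $H_1(mvc)$ may be stated as follows:

$$m_1 = m_2; \quad m_3 = m_4 \quad \text{and}$$

$$
\|A^{ij}\| =
\left(\begin{array}{cc|cc}
R & S & W & W \\
S & R & W & W \\ \hline
W & W & T & U \\
W & W & U & T
\end{array}\right)
\tag{2.3}
$$

(2.3) is simply a restatement of (1.3). $H_1(vc)$ and $H_1(m)$ can be
formulated easily for the (2,2) grouping by means of (2.3), e.g., $H_1(vc)$
asserts merely that the variance-covariance matrix has the form given
in (2.3).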

A normal multivariate distribution for which one or more of the six
hypotheses $H_1(mvc)$, $H_1(vc)$, ..., $\bar{H}_1(m)$ is true, is said to have
“compound symmetry”. The symmetry is of Type I or Type II according
as an $H$ or an $\bar{H}$ hypothesis is applicable.* The sampling theory of
criteria for hypotheses of compound symmetry is given in [2]. The
sampling theory of criteria relevant to the case where there is only one
group of variates is given in [1]; in this case the two types of symmetry
are identical.

*The hypothesis of Type II symmetry was formulated on the basis of a description by J. Allan
Rafferty, M.D. (USAF School of Aviation Medicine, Randolph Field, Texas) of the need in medical
research for testing statistically whether a biological system is ‘stable’.

Sample criteria for the hypotheses $H_1(mvc)$, $H_1(vc)$, ..., $\bar{H}_1(m)$
are represented by $L_1(mvc)$, $L_1(vc)$, ..., $\bar{L}_1(m)$, respectively.* In
section 4, $L_1(mvc)$, $L_1(vc)$, and $L_1(m)$ are given explicitly for the grouping
(2,2) as functions of the sample means, sums of squares and sums of
cross products.


3. $H_1(m)$, $\bar{H}_1(m)$ and a Hypothesis in Analysis of Variance. A
sample $O_N$: $(X_{1\alpha}, X_{2\alpha}, \ldots, X_{t\alpha})$ $(\alpha = 1, \ldots, N)$ of size $N$ can be
exhibited in a $t \times N$ array as follows:

$$
\begin{array}{cccc}
X_{11} & X_{12} & \cdots & X_{1N} \\
X_{21} & X_{22} & \cdots & X_{2N} \\
\vdots & \vdots &        & \vdots \\
X_{t1} & X_{t2} & \cdots & X_{tN}
\end{array}
\tag{3.1}
$$

Each column, $X_{1\alpha}, X_{2\alpha}, \ldots, X_{t\alpha}$, in (3.1) will be considered as a
normal $t$-variate chance quantity, where the $t$ variates may be correlated
$(\alpha = 1, \ldots, N)$; any two X’s in different columns will be assumed
to be uncorrelated. In 3A, 3B, 3C, below, tests of hypotheses about
“row effects” will be considered. 3A contains merely a discussion of a
“row effect” test when the observations are uncorrelated; 3B deals
with the case where for all columns there is a common correlation, $\rho$,
among elements within a column; 3C deals with a generalization of 3B.
The hypothesis treated in 3B is similar to $H_m$ in [1]; the generalization
in 3C is based on hypotheses similar to $H_1(m)$ and $\bar{H}_1(m)$.


3A. Analysis of Variance Test for Equality of Row Effects.

In one of the approaches to the Analysis of Variance (see [5, pp.
177-179]) it is assumed that the distribution of $X_{i\alpha}$ is normal with
true mean $\mu + r_i + c_\alpha$ and variance $\sigma^2$ $(i = 1, \ldots, t;\ \alpha = 1, \ldots, N)$,
and that the X’s are uncorrelated. The $r$’s and $c$’s represent row and
column effects, respectively, and $\mu$ is an effect common to all rows and
columns. In such cases, the usual null hypothesis tested is that the
row effects are equal, i.e., $r_1 = r_2 = \cdots = r_t$. For this test, the
appropriate statistic is the variance ratio, $F$. (See (c), p. 179, in [5].)

*The notation is the same as that in [2], wherein the subscript 1 in $H_1$ indicates that only one
population is under consideration.


3B. An F-test for Equality of “Row Effects” When There Are Correlations.
A test for equality of row effects can be made even when there are
correlations among some of the X’s in (3.1). In fact, the same F-test
referred to above is still valid, under the following conditions: the
true mean of $X_{i\alpha}$ is $\mu + r_i + c_\alpha$ $(i = 1, \ldots, t;\ \alpha = 1, \ldots, N)$; the
true variance of $X_{i\alpha}$ is $\sigma^2$ for all $i$ and $\alpha$; the true correlation between
$X_{i\alpha}$ and $X_{j\alpha}$ $(i \neq j)$ is $\rho$ for all $i$, $j$, $\alpha$; $X_{i\alpha}$ and $X_{j\alpha'}$ are uncorrelated
$(i, j = 1, \ldots, t;\ \alpha, \alpha' = 1, \ldots, N;\ \alpha \neq \alpha')$; and $(X_{1\alpha}, \ldots, X_{t\alpha})$ has a
normal $t$-variate distribution $(\alpha = 1, \ldots, N)$. The null hypothesis,
say $H'_m$, is that $r_1 = r_2 = \cdots = r_t$. $H'_m$ does not presuppose that the
column effects are equal; however, if they are equal, then $H'_m$ reduces
to $H_m$ in [1]. The statistic $L_m$, which equals $(N - 1)/(N - 1 + F)$
where $F$ is the variance ratio (see [1, pp. 262, 265]), is a test of $H_m$.
When $H'_m$ is true, the distribution of $L_m$ (also $F$) can be shown to be
independent of the column effects; therefore $L_m$ is a test of $H'_m$.
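The relation between the variance ratio and the criterion can be checked numerically. The sketch below is illustrative only: the function name `row_effect_criterion` and the small array are hypothetical, and it assumes that F is the usual two-way variance ratio for rows, with $(t - 1)$ and $(t - 1)(N - 1)$ degrees of freedom. Under that assumption $(N - 1)/(N - 1 + F)$ reduces algebraically to the error sum of squares divided by the sum of the error and row sums of squares.

```python
def row_effect_criterion(table):
    """Return (F, L_m, SS_rows, SS_error) for a t x N array given as t rows."""
    t, n_cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (t * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][a] for i in range(t)) / t for a in range(n_cols)]
    # Sum of squares between rows and residual (error) sum of squares.
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_error = sum(
        (table[i][a] - row_means[i] - col_means[a] + grand) ** 2
        for i in range(t)
        for a in range(n_cols)
    )
    f_ratio = (ss_rows / (t - 1)) / (ss_error / ((t - 1) * (n_cols - 1)))
    l_m = (n_cols - 1) / (n_cols - 1 + f_ratio)
    return f_ratio, l_m, ss_rows, ss_error


# Hypothetical 3 x 4 array (t = 3 rows, N = 4 columns).
table = [
    [5.0, 6.1, 7.2, 5.5],
    [5.4, 6.0, 7.9, 5.9],
    [6.1, 6.8, 8.0, 6.9],
]
f_ratio, l_m, ss_rows, ss_error = row_effect_criterion(table)
print("F =", f_ratio, " L_m =", l_m)
```

Large values of F therefore correspond to small values of the criterion, which is why small values of $L_m$ are the significant ones.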


3C. Test for Equality of Within-Group Row Effects When There Are Correlations.

Assume there is a grouping* $(n_1, \ldots, n_h)$ of the $t$ rows $(t = n_1 + \cdots + n_h)$.
With the $a$-th group $(a = 1, \ldots, h)$ we associate a “group
row effect”, $R_a$; within the $a$-th group there are, say, $n_a$ rows, which
may be represented by $1, 2, \ldots, n_a$, and with row $i_a$ we associate a
‘within-group row effect’, $r_{a i_a}$. The statistic $L_1(m)$ (see section 4
and A.3) can be used to test for equality of within-group row effects
for each group, i.e., to test that for every $a$, $r_{a1} = r_{a2} = \cdots = r_{a n_a}$.
Use of this test presupposes that: (A) the true mean of $X_{i_a \alpha}$ is
$\mu + R_a + r_{a i_a} + c_\alpha$ $(\alpha = 1, \ldots, N;\ a = 1, \ldots, h;\ i_a = 1, \ldots, n_a)$;
and (B) for every $\alpha$, $(X_{1\alpha}, \ldots, X_{t\alpha})$ has a $t$-variate normal
distribution with a common variance-covariance matrix having the
pattern of symmetry in (A.1). The null hypothesis is that for every
$a$, $r_{a1} = r_{a2} = \cdots = r_{a n_a}$. When (A), (B), and the null hypothesis
hold, it can be proved (by methods in [2]) that the distribution of
$L_1(m)$ is independent of the column effects $c_1, \ldots, c_N$ and is therefore
the same as the distribution of $L_1(m)$ given in [2]; moreover, given
that (A) and (B) hold and $c_1 = c_2 = \cdots = c_N$, the hypothesis of
equality of within-group row effects and the hypothesis $H_1(m)$ are
identical.

*(See A.1 for a discussion of grouping; also see the footnote after A.4.)

The statistic $\bar{L}_1(m)$ (see A.2) can be used to test for equality of
within-group row effects when the grouping is $(n^h)$* $(t = nh)$. The
assumptions on which the test is based are: (A) (as above); and (B′)
for every $\alpha$, $(X_{1\alpha}, \ldots, X_{t\alpha})$ has a $t$-variate normal distribution with a
common variance-covariance matrix having the pattern of symmetry
described in the last paragraph of section A.1. The null hypothesis is
the same as in the preceding paragraph. When (A), (B′), and the
null hypothesis hold, the distribution of $\bar{L}_1(m)$ is independent of $c_1$,
$c_2, \ldots, c_N$ and is the same as the distribution of $\bar{L}_1(m)$ given in [2].
Also, given that (A) and (B′) are true and $c_1 = c_2 = \cdots = c_N$, the
hypothesis of equality of within-group row effects and the hypothesis
$\bar{H}_1(m)$ are identical.

4. Application to a Medical Experiment. Considerable attention has 
been focussed recently on the use of heparin and other anticoagulants in 
frostbite therapy. Particular interest has been manifested by the 
military services because of the expansion in the concept of global war- 
fare to include frigid zones as possible combat areas. Along this line, the 
Department of Pathology at the USAF School of Aviation Medicine has 
been conducting a series of frostbite experiments with rabbits. In these 
experiments hind legs of rabbits are immersed in supercooled solutions 
and partially frozen. It has been determined that one of the most reli- 
able measures of atrophy or necrosis is the change in weight of the muscle 
tissue. Unfortunately on a single rabbit it is not possible to obtain from 
the same leg a measurement of the initial muscle weight and a measure- 
ment of the muscle weight after exposure to cold. As a workable pro- 
cedure experimenters have proposed that only one hind leg of each rabbit 
be frozen and that the weight of the corresponding muscle in the leg 
which is not frozen be taken as the initial weight of the muscle in the leg 
which is frozen. Although the method had been used with apparent 
success in previous work, the experimenters were anxious to test statis- 
tically the underlying hypotheses. The resulting experiment provides 
an excellent example of the use of compound symmetry tests in medical 
research. 

Sixteen normal rabbits were selected at random. The rabbits were 
not subjected to any treatment, but the anterior and posterior muscles 
of both hind legs of each rabbit were removed and weighed. These 
measurements are recorded in Table 1. Let $X_1$, $X_2$, $X_3$, and $X_4$ be
the anterior left, anterior right, posterior left and posterior right muscle
weights, respectively. It is assumed that $(X_1, X_2, X_3, X_4)$ has a
4-variate normal distribution. Then the data in Table 1 constitute a
random sample of size 16 from a 4-variate normal distribution.

*See A.1.

Clearly, one of the hypotheses in which the experimenter is interested
is that the two anterior muscle means are equal and the two posterior
muscle means are equal.* In section 3 this hypothesis is designated
$H_1(m)$ and may be written $m_1 = m_2$; $m_3 = m_4$. A second hypothesis
in which the experimenter is interested concerns the form of the
variance-covariance matrix associated with $(X_1, X_2, X_3, X_4)$. He requires


TABLE 1
MUSCLE WEIGHTS (GRAMS)

Rabbit          Anterior                Posterior
Number     Left Leg   Right Leg    Left Leg   Right Leg
   1         5.0        4.9          15.0       15.2
   2         4.8        5.0          14.2       14.3
   3         4.3        4.3          12.8       12.8
   4         5.1        5.3          14.4       14.6
   5         4.1        4.1          11.0       11.0
   6         4.0        4.0          12.5       12.6
   7         7.1        6.9          19.6       19.5
   8         5.9        6.3          15.9       15.8
   9         5.3        5.2          14.1       13.8
  10         5.3        5.5          14.5       14.8
  11         5.3        5.5          16.3       15.7
  12         5.9        5.9          16.4       16.2
  13         6.5        6.8          18.6       19.0
  14         6.3        6.3          18.1       17.4
  15         6.6        6.6          17.3       17.5
  16         6.2        6.3          18.1       17.7


that the variances of the two anterior measurements be equal and that
the variances of the two posterior measurements be equal, but does
not require that the anterior and posterior variances be equal.
Furthermore, in this example it is reasonable to require that the
intercorrelations between anterior and posterior muscle weights be all equal.
This hypothesis concerning the variance-covariance matrix is designated
$H_1(vc)$ and may be written

$$\sigma_1 = \sigma_2; \quad \sigma_3 = \sigma_4; \quad \rho_{13} = \rho_{14} = \rho_{23} = \rho_{24}.$$

*Given that the variance-covariance matrix satisfies certain symmetry conditions.

In the compound symmetry analysis the test for $H_1(m)$ assumes
that $H_1(vc)$ is true. On the other hand the test for $H_1(vc)$ is valid
whether or not $H_1(m)$ is satisfied. A third hypothesis $H_1(mvc)$
encompasses all the statements made in $H_1(m)$ and $H_1(vc)$, and a separate
test of it may be made. The distinction to be made here is that the test
of $H_1(mvc)$ is a joint test of $H_1(m)$ and $H_1(vc)$, so that $H_1(mvc)$ should
be rejected whenever either or both are not satisfied. A test of $H_1(m)$
alone would provide the experimenter with some knowledge of the
interchangeability of the right and left leg muscle weights; but even if
$H_1(m)$ is true, he may hesitate to use the right and left legs
interchangeably if $H_1(vc)$ is not satisfied. In such cases it is probably
best to make a final judgment only after all three hypotheses have been
tested. The computational procedures which are given below are similar
to those which would be employed in other problems with different
groupings of the variates or for testing other hypotheses
[$\bar{H}_1(m)$, $\bar{H}_1(vc)$, $\bar{H}_1(mvc)$].

The computations are described and illustrated in stepwise fashion,
and a slightly different procedure is required for each of the three
hypotheses. We have (a) a set of variates $(X_1, X_2, X_3, X_4)$, (b) a
grouping of the variates into two groups of two each, (c) a significance
level, say $a$ $(0 < a < 1)$, and (d) a sample, say

$$
\begin{array}{cccc}
X_{11} & X_{12} & \cdots & X_{1N} \\
X_{21} & X_{22} & \cdots & X_{2N} \\
X_{31} & X_{32} & \cdots & X_{3N} \\
X_{41} & X_{42} & \cdots & X_{4N}
\end{array}
$$

of $N$ values of $(X_1, \ldots, X_4)$. For testing $H_1(vc)$ the procedure is as
follows:

(1) Compute

$$v_{ij} = \sum_{\alpha=1}^{N} (X_{i\alpha} - \bar{X}_i)(X_{j\alpha} - \bar{X}_j) \qquad (i, j = 1, \ldots, 4),$$

where $\bar{X}_i = \frac{1}{N} \sum_{\alpha=1}^{N} X_{i\alpha}$ is the sample mean of $X_i$.

From Table 1, we find:

$\bar{X}_1 = 5.48$     $\bar{X}_2 = 5.56$     $\bar{X}_3 = 15.55$     $\bar{X}_4 = 15.49$

$v_{11} = 12.8844$     $v_{12} = 12.8869$     $v_{23} = 32.1750$     $v_{34} = 84.4450$
$v_{22} = 13.2794$     $v_{13} = 32.0350$     $v_{24} = 31.5156$
$v_{33} = 87.4000$     $v_{14} = 31.2381$
$v_{44} = 82.9894$     $N = 16$


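(2) Compute $\tilde{v}_{ij} = \tilde{v}_{ji}$ $(i, j = 1, \ldots, 4)$ as follows:

$\tilde{v}_{11} = \tilde{v}_{22} = \tfrac{1}{2}(v_{11} + v_{22})$;     $\tilde{v}_{12} = v_{12}$;
$\tilde{v}_{33} = \tilde{v}_{44} = \tfrac{1}{2}(v_{33} + v_{44})$;     $\tilde{v}_{34} = v_{34}$;
$\tilde{v}_{13} = \tilde{v}_{14} = \tilde{v}_{23} = \tilde{v}_{24} = \tfrac{1}{4}(v_{13} + v_{14} + v_{23} + v_{24})$.

In this example,

$\tilde{v}_{11} = \tilde{v}_{22} = 13.0819$     $\tilde{v}_{12} = 12.8869$
$\tilde{v}_{33} = \tilde{v}_{44} = 85.1947$     $\tilde{v}_{34} = 84.4450$
$\tilde{v}_{13} = \tilde{v}_{14} = \tilde{v}_{23} = \tilde{v}_{24} = 31.7409$

(3) Compute the determinants $|v_{ij}|$, $|\tilde{v}_{ij}|$ and their ratio:

$$L_1(vc) = \frac{|v_{ij}|}{|\tilde{v}_{ij}|} = y_0, \text{ say.}$$

We find

$$L_1(vc) = \frac{49.0204}{54.8802} = .893 = y_0.$$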
(4) Evaluate $F(y; 1^b, 2, 2; N)$* (given in (A.15)) for $y = y_0$, where
in this case $b = 0$. Accept or reject $H_1(vc)$ according as
$F(y_0; 1^b, 2, 2; N)$ is greater than or not greater than $a$.
For $y_0 = .893$, we find

$F(y_0; 1^b, 2, 2; N) = .920$ $(b = 0, N = 16)$.

If we are working at a significance level $a = .05$, we accept $H_1(vc)$.

*5% and 1% points of the sample criteria are tabled in [7].

For testing $H_1(mvc)$, the first step is the same as step 1 above, and
the remaining steps are as follows:


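(2′) Compute the “pooled” means $\bar{X}'_1 = \tfrac{1}{2}(\bar{X}_1 + \bar{X}_2)$,
$\bar{X}'_2 = \tfrac{1}{2}(\bar{X}_3 + \bar{X}_4)$;
then compute $v'_{ij} = v'_{ji}$ $(i, j = 1, \ldots, 4)$, as follows:

$v'_{11} = v'_{22} = \tfrac{1}{2}(v_{11} + v_{22}) + (N/2)[(\bar{X}_1 - \bar{X}'_1)^2 + (\bar{X}_2 - \bar{X}'_1)^2]$;
$v'_{33} = v'_{44} = \tfrac{1}{2}(v_{33} + v_{44}) + (N/2)[(\bar{X}_3 - \bar{X}'_2)^2 + (\bar{X}_4 - \bar{X}'_2)^2]$;
$v'_{12} = v_{12} - (N/2)[(\bar{X}_1 - \bar{X}'_1)^2 + (\bar{X}_2 - \bar{X}'_1)^2]$;
$v'_{34} = v_{34} - (N/2)[(\bar{X}_3 - \bar{X}'_2)^2 + (\bar{X}_4 - \bar{X}'_2)^2]$;
$v'_{13} = v'_{14} = v'_{23} = v'_{24} = \tilde{v}_{13} = \tilde{v}_{14} = \tilde{v}_{23} = \tilde{v}_{24}$.

In this example,

$\bar{X}'_1 = 5.52$,     $\bar{X}'_2 = 15.52$
$v'_{11} = v'_{22} = 13.1075$     $v'_{12} = 12.8613$
$v'_{33} = v'_{44} = 85.2091$     $v'_{34} = 84.4306$
$v'_{13} = v'_{14} = v'_{23} = v'_{24} = 31.7409$

(3′) Compute the determinants $|v_{ij}|$, $|v'_{ij}|$ and their ratio

$$L_1(mvc) = \frac{|v_{ij}|}{|v'_{ij}|} = u_0, \text{ say.}$$

We find

$$L_1(mvc) = \frac{49.0204}{|v'_{ij}|} = .681 = u_0.$$

(4′) Evaluate $F(u; 1^b, 2, 2; N)$ given in (A.13) for $u = u_0$, where in
this case $b = 0$. Accept or reject $H_1(mvc)$ according as
$F(u_0; 1^b, 2, 2; N)$ is greater than or not greater than $a$.
For $u_0 = .681$, we find

$F(u_0; 1^b, 2, 2; N) = .653$ $(b = 0, N = 16)$.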

In this case also we are led to accept $H_1(mvc)$.

The procedure for testing $H_1(m)$ is as follows:

(1″) Compute $\tilde{v}_{11}$, $\tilde{v}_{12}$, $\tilde{v}_{33}$, $\tilde{v}_{34}$ (see (2) above) and
$v'_{11}$, $v'_{12}$, $v'_{33}$, $v'_{34}$ (see (2′) above).

(2″) Compute

$$L_1(m) = \frac{(\tilde{v}_{11} - \tilde{v}_{12})(\tilde{v}_{33} - \tilde{v}_{34})}{(v'_{11} - v'_{12})(v'_{33} - v'_{34})} = z_0, \text{ say.}$$

(3″) Evaluate $F(z; 1^b, 2, 2; N)$ (given in (A.9)) for $z = z_0$, where in
this case $b = 0$. Accept or reject $H_1(m)$ according as
$F(z_0; 1^b, 2, 2; N)$ is greater than or not greater than $a$.
For $z_0 = .763$, we find

$F(z_0; 1^b, 2, 2; N) = .140$ $(b = 0, N = 16)$.

Accordingly we accept $H_1(m)$ at the 5% level of significance.
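The three criteria can be recomputed directly from the muscle weights. The sketch below is not the authors' program; the names `RABBITS`, `det`, and `criteria` are hypothetical. It takes the Table 1 entries for rabbits 7, 15 and 16 as 7.1, 17.5 and 17.7, values consistent with the printed means and sums of products. It also assumes, on the evidence of the printed values of $v'_{ij}$, that the sample means were rounded to two decimals before the pooled quantities were formed; the optional argument `mean_digits` reproduces that choice.

```python
# Table 1: (anterior left, anterior right, posterior left, posterior right).
RABBITS = [
    (5.0, 4.9, 15.0, 15.2), (4.8, 5.0, 14.2, 14.3), (4.3, 4.3, 12.8, 12.8),
    (5.1, 5.3, 14.4, 14.6), (4.1, 4.1, 11.0, 11.0), (4.0, 4.0, 12.5, 12.6),
    (7.1, 6.9, 19.6, 19.5), (5.9, 6.3, 15.9, 15.8), (5.3, 5.2, 14.1, 13.8),
    (5.3, 5.5, 14.5, 14.8), (5.3, 5.5, 16.3, 15.7), (5.9, 5.9, 16.4, 16.2),
    (6.5, 6.8, 18.6, 19.0), (6.3, 6.3, 18.1, 17.4), (6.6, 6.6, 17.3, 17.5),
    (6.2, 6.3, 18.1, 17.7),
]


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, sign, result = len(a), 1.0, 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            sign = -sign
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return sign * result


def criteria(data, mean_digits=None):
    """L1(vc), L1(mvc), L1(m) for the grouping (2,2) of four variates."""
    n = len(data)
    means = [sum(row[i] for row in data) / n for i in range(4)]
    v = [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in data)
          for j in range(4)] for i in range(4)]
    # Assumption: optionally round the means before forming v'.
    m = means if mean_digits is None else [round(x, mean_digits) for x in means]
    groups = [(0, 1), (2, 3)]
    cross = sum(v[i][j] for i in groups[0] for j in groups[1]) / 4
    v_tilde = [[cross] * 4 for _ in range(4)]
    v_prime = [[cross] * 4 for _ in range(4)]
    for i, j in groups:
        pooled = (m[i] + m[j]) / 2
        shift = (n / 2) * ((m[i] - pooled) ** 2 + (m[j] - pooled) ** 2)
        v_tilde[i][i] = v_tilde[j][j] = (v[i][i] + v[j][j]) / 2
        v_tilde[i][j] = v_tilde[j][i] = v[i][j]
        v_prime[i][i] = v_prime[j][j] = v_tilde[i][i] + shift
        v_prime[i][j] = v_prime[j][i] = v[i][j] - shift
    l_vc = det(v) / det(v_tilde)
    l_mvc = det(v) / det(v_prime)
    return {"vc": l_vc, "mvc": l_mvc, "m": l_mvc / l_vc}


printed = criteria(RABBITS, mean_digits=2)   # means rounded as in the text
exact = criteria(RABBITS)                    # unrounded means
print({k: round(x, 3) for k, x in printed.items()})
print({k: round(x, 3) for k, x in exact.items()})
```

With rounded means the sketch should come out close to the printed values .893, .681 and .763. With unrounded means $L_1(vc)$ is unchanged, while $L_1(m)$ and $L_1(mvc)$ come out somewhat larger, which does not alter the decision to accept.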

On the basis of the above calculations the sample of measurements
on 16 rabbits is consistent with any of the three hypotheses, $H_1(mvc)$,
$H_1(vc)$ and $H_1(m)$. In other words, the experimenter has no evidence
in his sample to indicate that any of the conditions exists which would
make him hesitate to use the corresponding right and left leg muscle
weights interchangeably. Thus as far as the statistics go, he has no
need to fear that the substitution of the weight of a muscle in the
untreated leg for the initial weight of the corresponding muscle in the
frozen leg will lead to erroneous results. In a broader sense it might
be said that the error involved in making the substitution is on the
average no greater than the variation observed from rabbit to rabbit.
In short, his procedure seems entirely valid.


APPENDIX 
A.1. General Expressions of the Compound Symmetry Hypotheses 


The hypothesis expressed in (1.1) (and in (2.1)) can be generalized
as follows. Suppose there are $q$ physical factors in an experiment,
where $b$ of the factors are measured once each and the $(b + a)$th factor
is measured $n_a$ times $(a = 1, \ldots, h;\ b + h = q;\ n_a \geq 2)$. The number
of chance quantities would then be $b + n_1 + \cdots + n_h = t$, say, and they
would fall into $b + h$ groups. A convenient notation for indicating
the grouping is* $(1^b, n_1, \ldots, n_h)$. [For example, the grouping $(1^2, 2, 3)$
would mean that there were seven variates $X_1, X_2, \ldots, X_7$, falling
into four groups as follows: $(X_1)$, $(X_2)$, $(X_3, X_4)$, $(X_5, X_6, X_7)$. In
(1.1) the grouping is* (2,3).] ‘Stability’ can be interpreted as meaning:
(1) for each group all means in the group are equal, all variances in
the group are equal, and all correlations between distinct chance
quantities in the group are equal; and (2) between any two distinct
groups all correlations are equal. This ‘stability’ hypothesis will be
represented by $H_1(mvc)$. The hypothesis obtained by removing from
$H_1(mvc)$ the condition on the means (see (1)) will be represented by
$H_1(vc)$. A third hypothesis, $H_1(m)$, presupposes that $H_1(vc)$ is true
and asserts that the condition on the means is true.

*The simpler notation $(n_1, \ldots, n_h)$ is used when $b = 0$; if, moreover, $n_1 = \cdots = n_h = n$, say,
the notation $(n^h)$ is sometimes used. If $b = 1$, $1^b$ is replaced by 1 in the proposed notation.

The hypothesis expressed in (1.2) (and in (2.2)) can be generalized
similarly; however, the restriction for this case that all sets of times
be identical requires that $b = 0$ and that $n_1 = n_2 = \cdots = n_h = n$,
say.* The generalization of (1.2) (and (2.2)) would leave (1) above
unchanged but would substitute the following statement for (2):
between any two distinct groups the correlations between simultaneously
measured chance quantities are equal and all remaining correlations
are equal. (For (1.2) and (2.2) the grouping is $(2^2)$.) This
‘stability’ hypothesis will be represented by $\bar{H}_1(mvc)$. Hypotheses
$\bar{H}_1(vc)$ and $\bar{H}_1(m)$, which are similar to $H_1(vc)$ and $H_1(m)$, respectively,
will also be considered. In the remainder of this section a more general
method of describing the six hypotheses will be given.

When the grouping of $(X_1, \ldots, X_t)$ is $(1^b, n_1, \ldots, n_h)$ and $H_1(mvc)$,
$H_1(vc)$ or $H_1(m)$ is true, the variance-covariance matrix $\|A^{ij}\|$ of
$(X_1, \ldots, X_t)$ is as given in (A.1) (the partition lines in (A.1) indicate
the grouping $(n_1, n_2, \ldots, n_h)$ of the last $t - b$ rows and the last $t - b$
columns of $\|A^{ij}\|$; see (2.1), (2.2), and (2.3)).

The hypothesis $H_1(vc)$ asserts merely that $\|A^{ij}\|$ is as given in
(A.1). $H_1(mvc)$ asserts that $H_1(vc)$ is true and that for every $a$ the
true means in the $(b + a)$th group of means are equal $(a = 1, \ldots, h)$;
$H_1(m)$ asserts that this condition on the means holds, given that $H_1(vc)$
is true.

When $\bar{H}_1(mvc)$, $\bar{H}_1(vc)$, or $\bar{H}_1(m)$ is true, then $b = 0$, the grouping
is $(n^h)$ $(t = nh)$, and $\|A^{ij}\|$ can be divided into $h^2$ $n \times n$ arrays within
each of which all diagonal elements are equal and all off-diagonal
elements are equal (see (2.2) for the case of a grouping $(2^2)$; also see
[2, p. 453]). $\bar{H}_1(vc)$ asserts that $\|A^{ij}\|$ has such a form; $\bar{H}_1(mvc)$
asserts that $\bar{H}_1(vc)$ is true and that for every $a$ the true means in the
$a$-th group of means are equal $(a = 1, \ldots, h)$; $\bar{H}_1(m)$ asserts that this
condition on the means holds given that $\bar{H}_1(vc)$ is true.
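The Type II pattern just described lends itself to a mechanical check. The following sketch is illustrative only; the function name `has_type2_symmetry` and the numerical entries are hypothetical. It examines each of the $h^2$ arrays of order $n \times n$ and confirms that the diagonal elements agree among themselves and that the off-diagonal elements agree among themselves.

```python
def has_type2_symmetry(matrix, n, h, tol=1e-9):
    """True if every n x n array of the (n^h) partition has one common
    diagonal value and one common off-diagonal value."""
    for a in range(h):
        for c in range(h):
            diag = [matrix[a * n + i][c * n + i] for i in range(n)]
            off = [matrix[a * n + i][c * n + j]
                   for i in range(n) for j in range(n) if i != j]
            for values in (diag, off):
                if values and max(values) - min(values) > tol:
                    return False
    return True


# Grouping (2^2): hypothetical entries arranged as in a Type II pattern.
R, S, T, U, V, W = 4.0, 3.0, 9.0, 5.0, 2.0, 1.0
type2 = [
    [R, S, V, W],
    [S, R, W, V],
    [V, W, T, U],
    [W, V, U, T],
]
# Breaking the equality of the two variances in the first group
# destroys the pattern.
broken = [row[:] for row in type2]
broken[1][1] = R + 0.5

print(has_type2_symmetry(type2, n=2, h=2))
print(has_type2_symmetry(broken, n=2, h=2))
```

A Type I matrix for the grouping $(n^h)$ passes the same check, since it is the special case in which the diagonal and off-diagonal values of every between-group array coincide.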


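A.2. General Expressions of Sample Criteria.
Let $O_N$: $(X_{1\alpha}, \ldots, X_{t\alpha})$ $(\alpha = 1, \ldots, N)$ be a sample of size $N$.

*See preceding footnote.

$$
\|A^{ij}\| =
\left(\begin{array}{ccc|ccc|c|ccc}
A^{11} & \cdots & A^{1b} & C_{11} & \cdots & C_{11} & \cdots & C_{1h} & \cdots & C_{1h} \\
\vdots &        & \vdots & \vdots &        & \vdots &        & \vdots &        & \vdots \\
A^{b1} & \cdots & A^{bb} & C_{b1} & \cdots & C_{b1} & \cdots & C_{bh} & \cdots & C_{bh} \\ \hline
C_{11} & \cdots & C_{b1} & A_1    & \cdots & B_1    & \cdots & D_{1h} & \cdots & D_{1h} \\
\vdots &        & \vdots & \vdots & \ddots & \vdots &        & \vdots &        & \vdots \\
C_{11} & \cdots & C_{b1} & B_1    & \cdots & A_1    & \cdots & D_{1h} & \cdots & D_{1h} \\ \hline
\vdots &        & \vdots & \vdots &        & \vdots & \ddots & \vdots &        & \vdots \\ \hline
C_{1h} & \cdots & C_{bh} & D_{1h} & \cdots & D_{1h} & \cdots & A_h    & \cdots & B_h    \\
\vdots &        & \vdots & \vdots &        & \vdots &        & \vdots & \ddots & \vdots \\
C_{1h} & \cdots & C_{bh} & D_{1h} & \cdots & D_{1h} & \cdots & B_h    & \cdots & A_h
\end{array}\right)
\tag{A.1}
$$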


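The sample criterion $L_1(vc) = |v_{ij}| / |\tilde{v}_{ij}|$. $\|\tilde{v}_{ij}\|$ has the same
appearance as $\|A^{ij}\|$ in (A.1); the elements of $\|\tilde{v}_{ij}\|$ are averages of
elements in $\|v_{ij}\|$. Let $s, s' = 1, \ldots, b$ and $i_a, j_a, i'_a, j'_a = b + \bar{n}_{a-1} + 1,
\ldots, b + \bar{n}_a$ $(\bar{n}_a = n_1 + \cdots + n_a;\ \bar{n}_0 = 0;\ a = 1, \ldots, h)$; the elements
of $\|\tilde{v}_{ij}\|$ may then be expressed as follows:

$$
\begin{aligned}
\tilde{v}_{ss'} &= v_{ss'}, \qquad
\tilde{v}_{s i_a} = \frac{1}{n_a} \sum_{i'_a} v_{s i'_a}, \\
\tilde{v}_{i_a i_a} &= \frac{1}{n_a} \sum_{i'_a} v_{i'_a i'_a} = \tilde{v}_a, \text{ say}; \\
\tilde{v}_{i_a j_a} &= \frac{1}{n_a (n_a - 1)} \sum_{i'_a \neq j'_a} v_{i'_a j'_a} = \tilde{w}_a, \text{ say}, \quad (i_a \neq j_a); \\
\tilde{v}_{i_a i_{a'}} &= \frac{1}{n_a n_{a'}} \sum_{i'_a,\, i'_{a'}} v_{i'_a i'_{a'}}, \quad (a \neq a';\ a, a' = 1, \ldots, h).
\end{aligned}
\tag{A.2}
$$

The sample criterion $L_1(mvc) = |v_{ij}| / |v'_{ij}|$. $\|v'_{ij}\|$, like $\|\tilde{v}_{ij}\|$,
has the same appearance as $\|A^{ij}\|$ in (A.1); moreover, $v'_{ss'} = \tilde{v}_{ss'}$,
$v'_{s i_a} = \tilde{v}_{s i_a}$, and $v'_{i_a i_{a'}} = \tilde{v}_{i_a i_{a'}}$ (see (A.2)), but

$$
\begin{aligned}
v'_{i_a i_a} &= \tilde{v}_a + \frac{N}{n_a} \sum_{i'_a} (\bar{X}_{i'_a} - \bar{X}'_a)^2 = v'_a, \text{ say}, \\
v'_{i_a j_a} &= \tilde{w}_a - \frac{N}{n_a (n_a - 1)} \sum_{i'_a} (\bar{X}_{i'_a} - \bar{X}'_a)^2 = w'_a, \text{ say}, \quad (i_a \neq j_a),
\end{aligned}
\tag{A.3}
$$

where $\bar{X}'_a = \sum_{i_a} \bar{X}_{i_a} / n_a$.

The criterion $L_1(m) = L_1(mvc)/L_1(vc) = |\tilde{v}_{ij}| / |v'_{ij}|$. Determinants
like $|\tilde{v}_{ij}|$ and $|v'_{ij}|$, having the appearance of $|A^{ij}|$ (see (A.1)),
can be simplified (see (3.3) in [2, p. 451]). From this simplification it
follows that

$$
L_1(m) = \prod_{a=1}^{h} \left\{ \frac{\tilde{v}_a - \tilde{w}_a}{v'_a - w'_a} \right\}^{n_a - 1}
\tag{A.4}
$$

(see (A.2), (A.3)). Note that $L_1(m)$ is independent* of $b$.
When an $\bar{H}$ hypothesis is under test the grouping is $(n^h)$ $(t = nh)$
(thus $b = 0$). The criterion $\bar{L}_1(vc) = |v_{ij}| / |\bar{v}_{ij}|$, where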


1 


t’e.t’e n(n — 


ke = i. + (a’ — a)n) 


n(n 1) ieh’e’ 
+ (a’ — a)n;h,. , hi. = (a’ — 1)n +1, --- , 


The criterion L,(mve) = | |/| |, where 


Thats = Bais — #0, 


= Viens — 


*When $H_1(m)$ is true, the distribution of $L_1(m)$ is independent of $b$ (see Tables 2 and 3).

The criterion $\bar{L}_1(m) = \bar{L}_1(mvc)/\bar{L}_1(vc)$; this ratio can be greatly
simplified also (see (3.5) in [2, pp. 453-454]).

When $H_1(mvc)$ is true, $L_1(m)$ and $L_1(vc)$ are independent; an
entirely similar remark holds regarding $\bar{H}_1(mvc)$, $\bar{L}_1(m)$ and $\bar{L}_1(vc)$.


A.3. Means, Variances, and Distributions of Sample Criteria When
the Corresponding Null Hypotheses are True.

The numbers on the unit interval, $0 < x < 1$, form the set of
possible values of each sample criterion. The exact distribution of
each criterion (when the corresponding null hypothesis is true) has
been identified as a product of independent beta variates (see [2]). A
beta variate, say $X$, has the c.d.f.:

$$P(X \leq x) = I_x(P, Q), \tag{A.5}$$

where $P, Q > 0$ and

$$I_x(P, Q) = \frac{\Gamma(P + Q)}{\Gamma(P)\,\Gamma(Q)} \int_0^x t^{P-1} (1 - t)^{Q-1}\, dt.$$

$I_x(P, Q)$ is termed the Incomplete Beta Function ratio (see [3], [4]).
The mean and variance of $X$ are, respectively:

$$\frac{P}{P + Q}, \qquad \frac{PQ}{(P + Q)^2 (P + Q + 1)}. \tag{A.8}$$


When $N$ is large, $-N \log_e(\text{criterion})$ has approximately
a $\chi^2$-distribution with a number of degrees of freedom that
depends on the grouping $(1^b, n_1, \ldots, n_h)$ (see [2, p. 467]). Small
values of the criterion are significant; thus, large values of
$-N \log_e(\text{criterion})$ are significant.
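As a rough illustration of the large-sample rule, the sketch below converts a criterion value into an approximate significance probability. It is not taken from the paper: the function names are hypothetical, and the number of degrees of freedom used in the example (5 for $L_1(vc)$ with grouping (2,2)) is an assumption obtained by counting constraints, namely 10 distinct elements in a general 4 × 4 variance-covariance matrix against 5 under the pattern (2.3). The values given in [2, p. 467] should be preferred. With $N$ as small as 16 the approximation can only be a crude guide.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with df degrees of
    freedom, using the power series for the lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, half = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= half / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-half + a * math.log(half) - math.lgamma(a))
    return 1.0 - lower


def approx_significance(criterion, n_sample, df):
    """Approximate P(criterion <= observed) from -N log_e(criterion)."""
    statistic = -n_sample * math.log(criterion)
    return statistic, chi2_sf(statistic, df)


# L1(vc) = .893 with N = 16; df = 5 is an assumed value (see lead-in).
statistic, prob = approx_significance(0.893, 16, 5)
print(statistic, prob)
```

The approximate probability is of the same order as the exact value .920 quoted in section 4, and leads to the same decision.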
In formulas for the c.d.f.’s of $L_1(mvc)$, $L_1(vc)$, $L_1(m)$, $\bar{L}_1(mvc)$,
$\bar{L}_1(vc)$, $\bar{L}_1(m)$, the independent variables used will be $u$, $y$, $z$, $\bar{u}$, $\bar{y}$, $\bar{z}$,
respectively. For any grouping $(1^b, n_1, \ldots, n_h)$ and sample size, $N$,
let $F(u; 1^b, n_1, \ldots, n_h; N)$ be the c.d.f.* of $L_1(mvc)$, ..., $F(\bar{z}; n^h; N)$
be the c.d.f. of $\bar{L}_1(m)$. C.D.F.’s of various criteria for various groupings
(when the corresponding null hypotheses are true) are given in
Table 2, together with the numbers of degrees of freedom of the $\chi^2$
distributions discussed below (A.8). The functions listed under
“C.D.F.” in Table 2 are Incomplete Beta Function ratios; however, it
should be noted in the table that in some cases not the criterion but a
simple function of the criterion is a beta variate (e.g., $L_1(mvc; 1^b, 3)$).
The mean and variance of each criterion (or of the simple function of
the criterion) in Table 2 can be obtained from (A.6) and (A.8).

*i.e., $P_r(L_1(mvc) \leq u) = F(u; 1^b, n_1, \ldots, n_h; N)$.
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TABLE 2. 
Degrees of 

Criterion-Grouping C.D.F. Freedom 
L,(mve; 1", 2) 2) b+2 
L,(mve; 1”, 3) IAN — b— 3,643), = Vu) 2b + 6 
L,ve; 1°, 2) 1) 
L,(ve; 1°, 3) I,(N —b—3,b+2),y = Vy) 4 
Lim; @ =D) 

= 
L,(ve; 2°) LAN — 4,2), = Vy) 4 
L, (moe; 2%) — 4,3), = Va) 6 
L,(m; n”) — Dm 1) 1,0 1], 

[2’ = (Wave?) 2(n 1) 
= N-—-hh 
L,(m; 2") h 


The exact distribution of certain other criteria (when the corre- 
sponding null hypotheses are true) are given below: 


F@; 1°, 2, 2; N) 


(N 
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~ 2(N + 1) 
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+ (2)tav + 3)(N + (2 3) 


(A.9) 
1, 3,3; N) = 
(A.10) 
— W — Ney" log, @), = V2. 
Fi; 2°; N) = 
Sul”? (A.11) 
W-5 


N 


6 —\3/2 
3 )a (a2) 


+ (N — 5)(y) are tanh Vi- 


x — 7)" + ( 


— (N — 4)(y)’” are cos vit. 


N-1 1) 


Plu; 1°, 2, 23 N) = 2 19 


=f 
: 


278 


BIOMETRICS, SEPTEMBER 1950 


+ [Bw - 0-4, 


b (A.13) 
x 2) 
I (2, b+3—- a) 
XW-b-it+9' 2 
where = (b + 2)!/[(q)(b + 2 — 
Fu; 2,3; = 3) 
~ | 3, 5, 5 |" 
- (A.14) 


= 12 U 
X are cos Vi 


4 12 8 1 


N-21 
1", 2,35) = 1% 52,5) 


(A.15) 


N-b-4+¢@ 
2 
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N-3 
F(y; 2,3;N) =y 
(N — 2)(N — — 4)(N — 5) by” 
6 2 3y log. u\ 
1°, 2, 3; N) = 2 »9 
(A.17) 


2 

The means and variances of these criteria (when the corresponding 

null hypotheses are true) for the cases covered in (A.9) through (A.17) 


are given in Table 3, together with the numbers of degrees of freedom 
of the x’-distributions discussed below (A.8). 


+ arc tanh +/1 — z. 
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A SIMPLE TREND TEST WITH APPLICATION TO 
ERYTHROCYTE SIZE DATA 


G. Elfving and J. H. Whitlock¹


Cornell University²


1. Summary. A simple rank statistic is used to detect a trend in
erythrocyte size data; the method may prove useful in other
cases where a more refined regression analysis would be too laborious.
The proposed criterion is essentially Kendall’s $S$ (or $\tau$) pooled over
several sets of observations. From the statistic in question, an estimate
of slope may be derived. The efficiency of this estimate may serve as
a measure of the efficiency of the test. It is calculated for the case
that the observations refer to equidistant time points and are normally
distributed around regression lines of constant slope.

A short description of the practical procedure is found at the end 
of section three. 

2. Although sheep erythrocytes are only one third as large (27 cu.
microns) as those of man (90 cu. microns), the data which we have
accumulated over the past four years indicate that the normal variance
in volume is of the same order of magnitude. Diagnosis of several
dietary deficiencies rests upon determinations of changes in both shape 
and size of erythrocytes. It is obvious that a change of 10 cubic 
microns in average cell volume represents a much more severe disorder 
in sheep than in man. Accordingly, it becomes desirable to diagnose 
such changes before they become lethal. The blood samples from 
anemic sheep used in our experiments (Whitlock, 1949) were tested by 
turbidimeter (Whitlock, 1947) with a red and green filter as well as 
being subjected to routine erythrocyte count and hematocrit determi- 
nations. The erythrocyte count and turbidimetric determinations were 
made on the same diluted sample of blood. Another sample was used 
for the hematocrit determinations. The red filter turbidimetric reading 
could be translated into expected counts and expected hematocrits. 


¹The method was first suggested to the latter author by Professor W. Feller of the Mathematics
Department, Cornell University. The authors are also indebted to Professor J. W. Tukey, Princeton
University, for improvements in the computing scheme. Research supported in part by ONR.

²Mathematics Department and New York State Veterinary College, Cornell University.

The difference between actual and expected counts and hematocrits
gave us a measure of cell volume and cell shape. The difference
between the red and green filter turbidimetric readings was translated
into hemoglobin values. Cell volume also could be determined by the
ratio between the hematocrit and the cell count. With two
sub-samples, our technic gave us two determinations of cell volume, one of
cell shape, and one of hemoglobin concentration per cell. The
determination of red cell count, hematocrit, and hemoglobin as routinely
done in most laboratories requires three sub-samples and yields one
determination of cell volume and one of hemoglobin concentration per
cell. In other words, the turbidimetric technic yields more information
from fewer sub-samples.

Hematocrit determinations are usually reported as packed cell vol- 
ume percentage and are probably only accurate within about +5%. 
The erythrocyte count error lies between 9 and 16%. When errors are 
so large in comparison to the effects expected, all animals in an experi- 
mental group would not show a uniform trend in measured cell volume 
even if a well defined trend were present. The essential problem, 
therefore, is to determine whether or not a trend in a series of measure- 
ments of a group of animals is sufficiently pronounced that mere chance 
would not account for its appearance. Regression technics would solve 
the problem, but they are too laborious to be worthwhile. 

3. Under these circumstances, a rank method was adopted. Besides
being quick, it has the advantage that the probability distribution (and
thus, particularly, the percentage points) of the test criterion is
independent of any assumptions about the distribution of the observed
values as long as no trend is present.

Table I lists data from previously reported experiments (Whitlock,
1949) indicating determinations of erythrocyte sizes in various anemic
sheep. To the size values $x_t$ $(t = 1, \ldots, n)$ of each individual $i$, rank
numbers (in parentheses) are assigned in order of increasing magnitude¹;
e.g., for sheep number 32, the cell volumes, in order, are:

(x)    27.5    29.0    31.3    28.0

and the corresponding rank numbers:

(R)    1       3       4       2

If there is a strong positive (or negative) trend in the x-values, the rank
numbers may be expected to occur nearly in the natural order $1, 2, \ldots, n$
(or $n, \ldots, 2, 1$, respectively). In order to measure the degree of
disorder in the rank sequence, we count the number of inverted pairs, i.e.
the number of pairs of rank numbers with the smaller rank to the right
of the greater. In our example there are two such pairs, namely (3,2)
and (4,2); the remaining pairs (1,3), (1,4), (1,2), and (3,4) are not
inverted. Alternatively, the number of inverted pairs is obtained by
noting, for each item in (R), the number of smaller items to the right,
and adding these numbers. We take twice the number of inverted pairs
and call it $k$, the redoubling being to facilitate certain calculations in
the following.

¹If two or more x-values coincide, it is usual to assign to all of them the average of the
corresponding rank numbers.

TABLE I

Determination of Cell Size (Cubic Microns)

Sheep
Number     1st         2nd           3rd           4th           5th          k
  25     25.4 (1)    26.5 (2)                                                 0
  63     24.1 (1)    24.9 (2)                                                 0
  26     26.3 (2)    24.1 (1)                                                 2
  55     27.3 (1)    29.0 (2)      29.4 (3)                                   0
  39     23.7 (1)    28.2 (2)      28.4 (3)                                   0
  66     25.0 (1)    25.7 (2)      29.6 (3)                                   0
  47     27.5 (1)    27.9 (2.5)    27.9 (2.5)                                 1
   6     23.0 (1)    28.4 (3)      25.0 (2)                                   2
  75     25.2 (1)    26.6 (2)      27.8 (3)                                   0
   5     24.1 (2)    22.9 (1)      28.9 (3)      30.1 (4)                     2
  35     21.3 (1)    23.6 (2)      25.8 (3)      29.4 (4)                     0
  32     27.5 (1)    29.0 (3)      31.3 (4)      28.0 (2)                     4
  45     25.4 (1)    32.5 (3)      33.2 (4)      31.6 (2)                     4
  13     28.5 (4)    28.1 (3)      26.9 (1)      27.0 (2)                    10
  74     28.4 (2)    27.2 (1)      29.6 (3)      36.0 (5)      31.1 (4)       4
  40     29.6 (1)    30.0 (2)      33.5 (5)      32.0 (3.5)    32.0 (3.5)     5
  34     24.9 (1)    26.3 (2)      31.3 (3)      38.0 (5)      32.8 (4)       2

                                                                         K = 36
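The counting rule can be written out as a short routine. This is a sketch with hypothetical function names. The treatment of ties, in which a tied pair contributes half an inversion and hence 1 to the redoubled count, is inferred from the entries for sheep 47 and 40 in Table I rather than stated in the text.

```python
def average_ranks(values):
    """Ranks in increasing order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1   # mean of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def redoubled_inversions(values):
    """k: 2 for every inverted pair, 1 for every tied pair (assumed rule)."""
    ranks = average_ranks(values)
    k = 0
    for i in range(len(ranks)):
        for j in range(i + 1, len(ranks)):
            if ranks[i] > ranks[j]:
                k += 2
            elif ranks[i] == ranks[j]:
                k += 1
    return k


sheep_32 = [27.5, 29.0, 31.3, 28.0]
print(average_ranks(sheep_32), redoubled_inversions(sheep_32))
```

For sheep 32 the routine returns the rank sequence 1, 3, 4, 2 and two inverted pairs, so that $k = 4$ as in Table I.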

The statistic $k$ is equivalent to Kendall’s $S$ or $\tau$ (Kendall, 1945,
I, 391)¹; in fact, it is easily verified that

$$k = \tfrac{1}{2} n(n - 1) - S = \tfrac{1}{2} n(n - 1)(1 - \tau).$$

If there is no trend, $k$ is a random variable (cf. loc. cit. p. 403) with
expectation

$$a_n = \tfrac{1}{2} n(n - 1) \tag{1}$$

and variance

$$\sigma_n^2 = \tfrac{1}{18} n(n - 1)(2n + 5). \tag{2}$$

¹Tukey has proposed to call the lesser of $k$ and $S$ the Kendall sum.

The distribution of $S$ is tabulated by Kendall for $n = 2, 3, \ldots, 10$
and we are thus, as far as only a single individual is concerned, in a
position to judge the significance of an observed number of inversions.
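Formulas (1) and (2) can be confirmed for small $n$ by direct enumeration. The sketch below, with hypothetical function names, lists all $n!$ orderings, treats them as equally likely (which is what the absence of a trend amounts to), and compares the exact mean and variance of $k$ with $\tfrac{1}{2}n(n - 1)$ and $\tfrac{1}{18}n(n - 1)(2n + 5)$.

```python
from fractions import Fraction
from itertools import permutations


def k_of(perm):
    """Twice the number of inverted pairs in a sequence without ties."""
    n = len(perm)
    return 2 * sum(1 for i in range(n) for j in range(i + 1, n)
                   if perm[i] > perm[j])


def null_moments(n):
    """Exact mean and variance of k over all n! equally likely orderings."""
    ks = [k_of(p) for p in permutations(range(n))]
    mean = Fraction(sum(ks), len(ks))
    variance = Fraction(sum(k * k for k in ks), len(ks)) - mean ** 2
    return mean, variance


for n in range(2, 7):
    mean, variance = null_moments(n)
    # Second and third columns of each line should match formulas (1), (2).
    print(n, mean, variance,
          Fraction(n * (n - 1), 2), Fraction(n * (n - 1) * (2 * n + 5), 18))
```

Exact rational arithmetic is used so that the comparison does not depend on rounding.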

In general, however, we want to pool the information obtained from
several, say $r$, individuals. For this purpose, the simplest procedure is
to add the single inversion numbers to form a pooled inversion number

$$K = \sum_{i=1}^{r} k_i. \tag{3}$$

The exact distribution of $K$, even under the null hypothesis that no
trend is present, is complicated; as soon as $r$ is not very small, the
distribution is, however, fairly near to the normal, with mean

$$A = \tfrac{1}{2} \sum_i n_i(n_i - 1) \tag{4}$$

and variance

$$S^2 = \tfrac{1}{18} \sum_i n_i(n_i - 1)(2n_i + 5), \tag{5}$$

where $n_i$ denotes the number of observations on the individual $i$. We
thus simply have to compute $A$ and $S$ and then judge the significance
of $K$ according to standard rules.


TABLE II

  n     a_n    3σ_n²
  2       1       3
  3       3      11
  4       6      26
  5      10      50
  6      15      85
  7      21     133
  8      28     196
  9      36     276
 10      45     375
The procedure may be facilitated by using Table II which gives a_n
and 3σ_n² for n = 2, ..., 10; both are integers. The applier simply has
to (a) count the redoubled number k of inverted pairs of rank numbers
for each individual, (b) take a_n and 3σ_n² from Table II for all occurring
numbers n of observations, (c) add these three items over all individuals,
thus obtaining the quantities K, A, and 3S², (d) take the square root
of 3S²/3 thus obtaining the standard deviation, S, of K; (e) divide the
difference K − A by S. If the difference K − A is more than twice
the standard deviation, the trend may be judged significant. If the
difference is more than thrice the standard deviation, the results are
highly significant.

The calculations are given in Table III. The resulting normal 
deviate amounting to more than four times its standard deviation, the 
trend is clearly significant. It should be noted that we are testing for 
trend in either direction, thus, the significance level applied to the 
normal deviate should be that of a two-sided test. 


TABLE III
CALCULATION OF NORMAL DEVIATE IN EXAMPLE

Observations    Number       Number of series    Number of series
per series      of series    × a_n               × 3σ_n²

     2              3               3                    9
     3              6              18                   66
     4              5              30                  130
     5              3              30                  150

Sums               17           A = 81            3S² = 355

K = 36

K − A = −45                     S² = 118.3

Deviate = −45 / √118.3 = −4.13
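
The steps (a) to (e) above can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's computing scheme: the function names are hypothetical, and the treatment of tied ranks (a tied pair counted as half an inversion) is an assumption inferred from the tabulated row with ranks 1, 2, 5, 3.5, 3.5, for which k = 5 is shown.

```python
from math import sqrt

def redoubled_inversions(ranks):
    """Return k: twice the number of inverted pairs in a rank sequence.

    A pair is inverted when the smaller rank stands to the right of the
    greater. Assumption: a tied pair counts as half an inversion.
    """
    k = 0
    n = len(ranks)
    for i in range(n):
        for j in range(i + 1, n):
            if ranks[j] < ranks[i]:
                k += 2          # a full inversion, redoubled
            elif ranks[j] == ranks[i]:
                k += 1          # half an inversion, redoubled (assumed)
    return k

def a_n(n):
    """Null expectation of k for a series of n observations, formula (1)."""
    return n * (n - 1) // 2

def three_sigma2(n):
    """Three times the null variance of k, as tabulated in Table II."""
    return n * (n - 1) * (2 * n + 5) // 6

def pooled_deviate(K, series_lengths):
    """Return (A, S^2, normal deviate) for the pooled inversion number K."""
    A = sum(a_n(n) for n in series_lengths)
    S2 = sum(three_sigma2(n) for n in series_lengths) / 3
    return A, S2, (K - A) / sqrt(S2)

# Series lengths of the worked example: 3 series of 2, 6 of 3, 5 of 4, 3 of 5.
lengths = [2] * 3 + [3] * 6 + [4] * 5 + [5] * 3
A, S2, z = pooled_deviate(36, lengths)
print(A, round(S2, 1), round(z, 2))
```

The deviate is referred to a two-sided normal table, as noted in the text.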


4. It may be of a certain interest to study the efficiency of the test
used. For this purpose, we first have to choose an appropriate
probability model for the phenomenon investigated, valid for the general
case that a trend is present. Of the parameters necessarily involved in
this model, one, say θ, has to be a measure of the trend; without loss
of generality, we may assume that the value θ = 0 corresponds to the
null hypothesis that no trend is present. The statistic k (or the pooled
statistic K) yields, suitably transformed, an unbiased estimate of θ.
We will calculate the variance of this estimate and compare it to the
theoretic lower bound of the variance of any unbiased estimate of θ
obtainable from the original set of observations (cf. e.g. Cramér, 1946,
p. 490-497). The ratio of the latter variance to the former measures
the efficiency of the estimate of θ and thus, in a certain sense, the
efficiency of the test proposed.

The most easily managed probability model is given by the
hypothesis that the observed values x are normally distributed, with the
mean 0 and standard deviation σ, around straight lines of common
slope β; the level of these lines may vary from individual to individual.
This hypothesis may be expressed by the formula:


$$x_i(t) = \alpha_i + \beta t + \varepsilon_{it} \tag{6}$$


where the ε_{it}’s (the pure random effects) are normal (0, σ). As a measure
of the essential trend, i.e. the trend compared with the standard
deviation of the random fluctuations, we introduce θ = β/σ.

Let us now, to begin with, study the efficiency of a single statistic
k for testing the null hypothesis θ = 0. The probability distribution
of k is, without difficulty, seen to depend only on θ and the number n
of time points, the latter being supposed equidistant. Its analytical
form is complicated; we may, however, easily construct an unbiased
estimate of θ, based on k and valid for small θ’s by expanding the
expectation of k in a Taylor series


$$E(k) = a_n + b_n \theta + \cdots \tag{7}$$

and neglecting terms of second and higher order. Obviously,

$$t = \frac{k - a_n}{b_n} \tag{8}$$

is an estimate of the kind mentioned. Here, a_n is given by (1); for b_n
one obtains, by some calculations here omitted,

$$b_n = -\frac{1}{6\sqrt{\pi}}\, n(n^2 - 1). \tag{9}$$

Thus, according to (2),

$$\operatorname{var}(t) = \frac{\sigma_n^2}{b_n^2} = \frac{4\pi\,(n + 5/2)}{n(n - 1)(n + 1)^2}. \tag{10}$$

On the other hand, the lower bound of the variance of any unbiased
estimate of θ turns out to be

$$\inf \operatorname{var}(t) = \frac{12}{n(n^2 - 1)} + \frac{\theta^2}{2n}; \tag{11}$$

hence, for vanishing θ, the efficiency of k is

$$e_n = \frac{3}{\pi} \cdot \frac{n + 1}{n + 5/2}. \tag{12}$$


We now turn to the efficiency of the pooled inversion number K.
Adding (7) over all individuals we get

$$E(K) = A + B\theta + \cdots, \tag{13}$$

where A = Σ a_{n_i}, B = Σ b_{n_i}. This yields for small θ’s the unbiased
estimate T = (K − A)/B, with the variance

$$\operatorname{var}(T) = \frac{S^2}{B^2} = \frac{4\pi \sum n_i(n_i - 1)(n_i + 5/2)}{\left[\sum n_i(n_i^2 - 1)\right]^2}. \tag{14}$$

The theoretic lower bound of the variance is found to be

$$\inf \operatorname{var}(T) = \frac{12}{\sum n_i(n_i^2 - 1)} + \frac{\theta^2}{2 \sum n_i}; \tag{15}$$

hence, for small θ, the efficiency of K is

$$e = \frac{3}{\pi} \cdot \frac{\sum n_i(n_i^2 - 1)}{\sum n_i(n_i - 1)(n_i + 5/2)}. \tag{16}$$


If all individuals show the same number n of observations, (16) is
identical with (12). In this case, the efficiency increases from e_2 =
2/π = 0.637 to e_∞ = 3/π = 0.955 as n increases.

If the n_i’s are unequal, (16) depends on the relative frequency of
the different n_i’s. If e.g. the values n = 2, 3, 4, 5 occur with equal
frequency, e is found to be 0.743. The efficiency could in the case of
varying n_i’s be somewhat improved by using a weighted sum instead
of the straight sum K = Σ k_i. The improvement seems, however, to
be too small to justify the more complex computations. In the numerical
example just mentioned, the improved efficiency is found to be 0.744.
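
The efficiency figures quoted above are easy to check numerically. The sketch below assumes that the small-θ efficiency of the straight sum K is (3/π) Σ n_i(n_i² − 1) / Σ n_i(n_i − 1)(n_i + 5/2), which reduces to (3/π)(n + 1)/(n + 5/2) when all series have the same length n; the function name is hypothetical, and the weighted-sum variant giving 0.744 is not reproduced.

```python
from math import pi

def pooled_efficiency(series_lengths):
    """Small-theta efficiency of the pooled inversion number K.

    series_lengths: iterable of n_i, the number of observations per individual.
    """
    num = sum(n * (n * n - 1) for n in series_lengths)
    den = sum(n * (n - 1) * (n + 2.5) for n in series_lengths)
    return 3.0 / pi * num / den

# Equal series lengths: the value depends on n only.
for n in (2, 3, 5, 10, 100):
    print(n, round(pooled_efficiency([n]), 3))

# Series lengths 2, 3, 4, 5 occurring with equal frequency.
print(round(pooled_efficiency([2, 3, 4, 5]), 3))
```

With n = 2 the function returns 2/π, and for long series it approaches 3/π, in agreement with the limits stated in the text.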
A NOTE ON THE FOUR BY FOUR LATIN SQUARES
W. J. YOUDEN


National Bureau of Standards, Washington, D. C. 


IT HAS BEEN pointed out by R. A. Fisher (1) that randomized blocks
have the advantage that it is possible to isolate the appropriate 
components of error applicable to any specified comparison of the treat- 
ments. This is useful when there is any reason to question the use of 
the pooled error term. 

The Latin Square effects a double elimination of block differences 
and there appears to be no reference to separating the residual inter- 
action of rows and columns into components which could be associated 
with the comparisons among the letters which designate the treatments 
in a Latin Square. It is possible to achieve this segregation for the 
Second Transformation Set of 4 x 4 Latin Squares (2). 

The standard square of this set is 


The orthogonal subscripts are used to distinguish the various replica- 
tions. Inspection shows that if, e.g., the comparison of A and C is of 
particular interest the error term belonging to this comparison is 
obtained by taking the following two comparisons of A and C, since 
these contrasts are independent of row and column effects. 


(A, + A) = (C5 + =d, 
and 


(A, + A;) (C; + = d, 
The discrepancy between these two estimates of twice the difference
between A and C constitutes one of the six degrees of freedom available
for the error term in a 4 x 4 square. Simply square the quantity
(d_1 − d_2) and divide by 8. The other five degrees of freedom
corresponding to the remaining five contrasts among the four letters A, B,
C and D may be individually obtained in the same way. A simple
numerical example will reveal that the sums of squares for the six
individual degrees of freedom isolated in this way do total to the sum
of squares for error obtained in the usual way.

Obviously one degree of freedom for error isn’t very helpful. It
will sometimes happen, as in a recent experiment on the thickness of
protective coatings on metals, that a number of replications (in this
case eight) of the Latin Square are available and the individual degrees
of freedom from each square may be accumulated. There was, in this
instance, a real reason for isolating these individual components in the
error. The “treatments” were measurements made by four different
laboratories and prior experience had shown that the laboratories did
not achieve the same precision in the measurements. This arrangement
made it possible to determine and appraise the bias between
laboratories.

This feature of the 4 x 4 squares may also be a helpful instruction aid 
since it makes possible a direct and easily understood computation of 
the error term for this Latin Square. It is certainly not apparent to 
many who use Latin Square designs that the residual sum of squares is 
the error term. 
AN INVERSE SAMPLING PROCEDURE FOR BACTERIAL
PLATE COUNTS 


MARTIN SANDELIUS 


University of Uppsala and University of Washington 


A COMMON PROCEDURE for estimation of bacterial density consists in 
counting the total number of colonies of bacteria on a circular 
plate. Since each colony has grown from a single bacterium living on 
the medium used on the plate, the result of counting is an estimate of 
the density of living bacteria per cc of the fluid from which the sample 
is taken. Experience has shown in many cases that the number of 
colonies found approximately follows a Poisson distribution. 

Often the number of colonies found will be so large that this counting 
procedure becomes laborious. An alternative procedure would be to 
apply the principle of inverse sampling, recently adopted in the bi- 
nomial case (cf. (1) and (2)). We choose on the circular plate a radius 
vector at random and determine the smallest sector beginning at this 
vector which contains exactly k colonies, each colony being reduced to 
a point by considering only its centre, and any colony with its centre 
on the initial radius vector being excluded. If the plate contains less 
than k colonies it may be assumed that the sampling process continues 
on other similar plates, in which case the final sector will have an angle 
greater than 360°. Further it will be supposed that the quality of the 
plates is so good that the Poisson distribution of the numbers of colonies 
within equal sectors is the same. 

Choosing the length of the whole circumference of a plate as the
unit, we let x denote the length of the sector determined by the inverse
sampling procedure. Denoting by a the density of bacteria per cc of
the fluid sampled and by v the amount, in cc, of this fluid on each
plate, it follows (cf. e.g. (3), Sect. 3) that 2avx has a chi-square
distribution with 2k degrees of freedom. Assuming k greater than 2, it
follows that (k − 1)/(vx) is an unbiased estimate of a, the variance of
this estimate being a²/(k − 2) (cf. e.g. (4), Sect. 33.3, Ex. 3). Further
the interval


$$\left( \frac{\chi^2_{\alpha/2}}{2vx},\ \frac{\chi^2_{1-\alpha/2}}{2vx} \right)$$


is a confidence interval for a corresponding to the confidence coefficient
1 − α, where 0 < α < 1 and χ²_p is given by the relation
Prob(χ² < χ²_p) = p, χ² being taken with 2k degrees of freedom.

Suppose now that, instead of having the same concentration of
bacteria on each plate, different concentrations are used. Then usually
3 or 4 plates will be sufficient to find at least k colonies. Let e.g. the
1st plate contain v cc of the fluid sampled, the 2nd 10 v cc, the 3rd 100 v
cc, etc. The same method is still applicable; the only modification
necessary is to change the length of the circumference of the 2nd plate
to 10, of the 3rd plate to 100, etc.
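
As an illustration, the estimate and a confidence interval can be computed with the Python standard library alone, since a chi-square variable with an even number 2k of degrees of freedom has a closed-form distribution function. This is a sketch under stated assumptions: the function names are hypothetical, the interval is taken to be equal-tailed, and the figures k = 10, v = 0.1 cc, x = 0.25 are invented for the example.

```python
from math import exp

def chi2_cdf_even(x, k):
    """Distribution function at x of chi-square with 2k degrees of freedom."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= (x / 2.0) / j      # (x/2)^j / j!
        total += term
    return 1.0 - exp(-x / 2.0) * total

def chi2_quantile_even(p, k, tol=1e-10):
    """Quantile chi2_p (2k degrees of freedom), found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, k) < p:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def inverse_sampling_estimate(k, v, x, alpha=0.05):
    """Return (estimate, lower, upper) for the density a.

    k: number of colonies counted, v: cc of fluid per plate,
    x: sector length in units of one circumference.
    """
    estimate = (k - 1) / (v * x)
    lower = chi2_quantile_even(alpha / 2.0, k) / (2.0 * v * x)
    upper = chi2_quantile_even(1.0 - alpha / 2.0, k) / (2.0 * v * x)
    return estimate, lower, upper

# Hypothetical figures: the 10th colony is reached after a quarter turn.
print(inverse_sampling_estimate(k=10, v=0.1, x=0.25))
```

For plates of differing concentration, x is simply measured on the rescaled circumferences (1, 10, 100, ...) described above before being passed to the function.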

The author wishes to express his gratitude to Dr. Douglas Chapman, 
University of Washington, who read the manuscript and made valuable 
suggestions. 
QUERIES


QUERY 80: In connection with partial regression and multiple
co-variance I have a question which disturbs me although it
probably has a very simple explanation. In carrying out the
calculations we find the following:

    Sx_1y = −310.12
    Sx_2y = −660.24
    Sx_3y = 10,462

The partial regression coefficients are as follows:

    b_{yx_1·x_2x_3} =  0.35313
    b_{yx_2·x_1x_3} = −0.95405
    b_{yx_3·x_1x_2} =  0.33149

As I understand it, one may state that the product (b_{yx_1·x_2x_3})(Sx_1y)
may be considered as expressing the portion of the sum of squares of Y
that is associated with the variations in X_1. This will, however,
obviously be a minus number. The other two are positive values. I would
greatly appreciate if you could explain to me the interpretation of the
negative values when they are obtained.


ANSWER: It is unfortunate, but multiple regression does not afford
a nice method for separating the sum of squares into
orthogonal batches attributable to the several variables.
Perhaps you can get the information you wish by looking at the
standard partial regression coefficients with their associated standard
errors. Or, you might use the following device which yields an exact
test. Analyze the variance (a regression with 3 independent variables is
used as illustration) in this way:


Source of Variation         Degrees of Freedom    Sum of Squares
2 Independent Variables     n − 3                 (1 − R²_{y.12})Sy²
3 Independent Variables     n − 4                 (1 − R²_{y.123})Sy²
Difference                  1                     (R²_{y.123} − R²_{y.12})Sy²


The difference with 1 degree of freedom is orthogonal to the sum of
squares immediately above it, so F = (R²_{y.123} − R²_{y.12})(n − 4)/
(1 − R²_{y.123}) tests the hypothesis that the two σ’s are the same.
Rejection of the hypothesis leads to the conclusion that the third
independent variable is, in the population, individually associated with Y.
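
The device in the answer amounts to a one-line computation once the two squared multiple correlations are known. The sketch below is illustrative only; the function name and the numerical values (R² of 0.50 and 0.60, n = 24) are hypothetical and are not taken from the query.

```python
def added_variable_F(r2_reduced, r2_full, n, n_vars_full=3):
    """F ratio (1 and n - n_vars_full - 1 d.f.) for the last variable added.

    r2_reduced: squared multiple correlation without the variable,
    r2_full:    squared multiple correlation with it,
    n:          number of observations.
    """
    df_residual = n - n_vars_full - 1          # n - 4 for three variables
    F = (r2_full - r2_reduced) * df_residual / (1.0 - r2_full)
    return F, df_residual

# Hypothetical figures, for illustration only.
F, df = added_variable_F(r2_reduced=0.50, r2_full=0.60, n=24)
print(round(F, 2), df)
```

The resulting F is referred to the F table with 1 and n − 4 degrees of freedom.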


QUERY 81: In a preliminary experiment on the percentage vitamin
content of replicate samples at different age levels the following
results were obtained:


PERCENTAGE VITAMIN CONTENT OF REPLICATE
SAMPLES AT DIFFERENT AGE LEVELS


Age (Days)    Vitamin Content (%)        Means
     0        36, 29, 30, 32             31.8
     3        47, 11, 50, 55             40.8
     7        24, 57, 57, 59             49.3
    10        21, 59, 55, 50             46.3
    14        56, 54, 51, 33             48.5
    17        53, 31, 59, 59             50.5
    21        73, 72, 42, 64, 62, 64     62.8

REGRESSION ANALYSIS

Source of Variation     d.f.    S.S.    M.S.    F
Between Age Groups        6     2630     438     2.25
  Regression              1     2249    2249    11.53**
  Deviations              5      381      76
Within Age Groups        23     4474     195
Total                    29     7104


I would like you to comment on the following: 

(1) Are we justified in doing a regression analysis when the F ratio 
for “between age groups” is not significant? 

(2) We have reason to believe that the data are best represented by a 
cubic equation, yet we get no significant deviation from linear regression. 
Can this be explained on the basis of an excessively large error term? 

(3) We are planning to repeat this experiment using eight equally 
spaced age levels (0, 3, 6, 9, 12, 15, 18, 21). The samples are from a 
natural product and the assays are expensive and time consuming. We 
are contemplating an increase to 10 or 15 replicates at each age level. 
Would you consider this a sufficiently large sample in view of the large 
intrinsic error? 
ANSWER: (1) Yes. The form of the analysis is inherent in your
experiment and could have been incorporated in your
project outline.

(2) There are four ways in which the non-significant deviation from 
linearity might be accounted for. (i) The population regression may be 
straight—this is the evidence furnished by the sample. (ii) The curvi- 
linearity in the population may be too small to be detected by a sample 
of any feasible size. (iii) The sample drawn may be too small to detect 
an existing curved regression. (iv) Your sample may be one of the unus- 
ual kind that happens to show small deviations though drawn from a 
population with detectible curvilinearity. 

(3) This is an appropriate question but it cannot be answered from 
the evidence given. If you will tell me, either graphically or algebraic- 
ally, the details of the cubic you expect, I might be able to estimate the 
size of sample required by use of your sample deviations from the 
expected regression. 

P.S.: The reply to this said, “This would be clearer to you if I could 
reveal the true nature of the data, which unfortunately I cannot.” 
So, that’s that. 
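
For readers who wish to retrace the regression analysis quoted in Query 81, the following sketch recomputes the sums of squares from the tabulated vitamin contents. The function name and the layout of the result are hypothetical; the F ratios are formed against the within-groups mean square, as in the query, and may differ in the second decimal from the printed values because the latter were formed from rounded figures.

```python
# Vitamin content (%) by age in days, as tabulated in Query 81.
data = {
    0: [36, 29, 30, 32],
    3: [47, 11, 50, 55],
    7: [24, 57, 57, 59],
    10: [21, 59, 55, 50],
    14: [56, 54, 51, 33],
    17: [53, 31, 59, 59],
    21: [73, 72, 42, 64, 62, 64],
}

def regression_anova(groups):
    """Split the between-groups sum of squares into linear regression and deviations."""
    xs = [x for x, ys in groups.items() for _ in ys]
    ys = [y for vals in groups.values() for y in vals]
    n = len(ys)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    ss_total = sum((y - ybar) ** 2 for y in ys)
    ss_between = sum(len(v) * (sum(v) / len(v) - ybar) ** 2 for v in groups.values())
    ss_within = ss_total - ss_between
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    ss_regression = sxy ** 2 / sxx
    ss_deviations = ss_between - ss_regression
    df_between, df_within = len(groups) - 1, n - len(groups)
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_regression": ss_regression,
        "ss_deviations": ss_deviations, "ss_within": ss_within,
        "ss_total": ss_total, "df_between": df_between, "df_within": df_within,
        "F_between": ss_between / df_between / ms_within,
        "F_regression": ss_regression / ms_within,
        "F_deviations": ss_deviations / (df_between - 1) / ms_within,
    }

for name, value in regression_anova(data).items():
    print(name, round(value, 2))
```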


QUERY 82: The question has arisen as to whether the analysis of
variance procedure may be applied to data which are not
replicated. The data consist of 225 items as follows:

The material consisted of 3 sweet corn varieties. Each variety was 
harvested at 3 predetermined stages of maturity as indicated by an 
objective test. The field replicates were combined, and each variety at 
each stage of maturity was divided into 5 lots, each being a different 
temperature treatment. The lots were then analyzed for certain chem- 
ical components at 5 successive periods of time. Since the zero period of 
time could not be subdivided into different temperatures, we are planning 
to repeat the value for this first period five times in the calculations. 
This should reduce the degrees of freedom from 224 to 188. 

Is there any reason why each set of chemical determinations cannot 
be analyzed by the variance methods? 


ANSWER: The superficial answer to your question is “No”. There is
no reason why the arithmetic of analysis of variance may
not be done in any one of many ways. The pertinent
question is this: “Under what circumstances will the analysis of variance
produce meaningful results?” The conditions for such results were
clearly stated by Eisenhart in this journal, Vol. 3, pages 1-21, 1947.

I shall assume that the collections of material from the original
randomized blocks experiment were taken at random positions within the
plots, equal amounts from each plot, so that the environmental effects 
were properly incorporated in the harvested corn: it may be that all the 
corn from each plot was included. I assume also that the harvested 
material was randomly assigned to the lots for temperature treatments. 
This would produce 45 lots which would presumably meet all specifica- 
tions for a meaningful analysis. 

The 5 (or 4) chemical determinations on a single lot can scarcely be 
expected to be independent, but deviations from linear regression might 
be. If not, the second degree regression might have some meaning and 
deviations from it would probably be random. If your time intervals are 
equal and if you had run your analysis on each lot at the initial time, it 
would be easy to analyze the variance, perhaps as follows: 


Variety, V                  2 degrees of freedom
Stage of Maturity, M        2
VM                          4
Temperature, T              4
TV                          8
TM                          8
TVM                        16
Linear Trend, L             1
LV                          2
LM                          2
LVM                         4
LT                          4
LTV                         8
LTM                         8
LTVM                       16
Remainder                 135

Total                     224


The analysis with 5 determinations on some lots and 4 on others 
would be messy. I think your plan of repeated use of the initial determi- 
nation in each lot will not introduce much inaccuracy into the regressions. 
I say this partly because, with only three replications of the varieties, 
great precision cannot be expected. 

If theoretical considerations and examination of your data suggest 
that curved regressions are required, and if this is verified by a large mean 
square in ‘Remainder’, you can extend the analysis of variance to include 
second degree terms. 


QUERY 83: Using wooden blocks secured from the sapwood of a
single log, a decay experiment was conducted in eight decay
chambers. Each chamber originally contained eight blocks, but
one of these was removed from each chamber at the end of each 2 week
period until all the blocks had been removed. This gave an 8 by 8 experi- 
ment with 8 different decay chambers and 8 different decay periods. 
The variable being studied was weight loss due to decay expressed as a 
percent of the original oven dry weight. It turned out that both the 
decay chambers and the length of the decay periods had a significant 
effect on the percent weight loss due to decay. In addition it was desired 
to see if the original specific gravity of each piece had any effect on the 
resistance of the piece to decay. In answering this latter question I 
resorted to an analysis of covariance. For the wood decay organism 
Polyporus versicolor, I got the following covariance of original specific
gravity times 10² (= x) and the percent weight loss due to decay (= y):


                                                       Errors of Estimate
Source of          d.f.    Sx²       Sxy       Sy²       Sum       d.f.   Mean
variation                                                of sq.           square
Decay chambers       7      6.86    −11.48   1113.36
Decay periods        7      7.11     60.64   2887.36
Error               49    134.26     20.98    909.76    906.39     48    18.88
Total               63    148.23     70.14   4910.48


Error regression coefficient, b = 0.1563


Student’s “t” = 0.1563 √(134.26/18.88) = 0.417


Since this value of “t” does not even approach significance, I assume
that this experiment does not indicate any relationship between original 
specific gravity and decay resistance of the wood when freed of the 
effect of decay chambers and decay periods. 

Is my interpretation of the meaning of error regression correct? 

Is it correct to calculate, test, and interpret the error regression 
coefficient as I have done? 


ANSWER: Your interpretation of the covariance is correct. All the
evidence points to the conclusion that the regressions in
decay chambers and decay periods, as well as in error,
may be no more than sampling variation in samples drawn from a
population whose regressions are all zero.
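
The arithmetic behind the error regression in Query 83 can be retraced as follows. This is a sketch only, with a hypothetical function name; it takes the error line of the covariance table as input, and the recomputed sum of squares of the errors of estimate may differ slightly from the printed figure because the printed sums are rounded.

```python
from math import sqrt

def error_regression(sxx, sxy, syy, error_df):
    """Regression of y on x within the error line of a covariance table.

    Returns (b, residual SS, residual d.f., residual mean square, t).
    """
    b = sxy / sxx                          # error regression coefficient
    resid_ss = syy - sxy ** 2 / sxx        # errors of estimate
    resid_df = error_df - 1                # one d.f. is used by the regression
    resid_ms = resid_ss / resid_df
    t = b * sqrt(sxx / resid_ms)           # b divided by its standard error
    return b, resid_ss, resid_df, resid_ms, t

# Error line of the table: Sx2 = 134.26, Sxy = 20.98, Sy2 = 909.76, 49 d.f.
b, resid_ss, resid_df, resid_ms, t = error_regression(134.26, 20.98, 909.76, 49)
print(round(b, 4), resid_df, round(t, 3))
```

The value of t is referred to Student’s table with the residual degrees of freedom, here 48.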




QUERY 84: We frequently must compare percentages, e.g. of
mortalities, of a treated and an untreated group of animals in
toxicological research. Usually the total number of animals in
each group is less than 30. We have used 4 methods for this comparison.
These are:

(1) Calculation of critical ratio (C.R.) as in Garrett’s Statistics in
Psychology and Education, Longmans, Green and Co. (1947) p. 218,
where

$$\text{C.R.} = \frac{p_1 - p_2}{\sqrt{\dfrac{p_1 q_1}{N_1} + \dfrac{p_2 q_2}{N_2}}}$$

The probability associated with the C.R. was obtained from a table of
fractional parts of the total area under the normal probability curve.
(2) Calculation of the uncorrected chi-square.
(3) Calculation of the corrected chi-square as in Snedecor’s Statistical
Methods, Iowa State College Press (1946).
(4) Calculation of t-test using “Student’s” distribution where

$$t = \frac{p_1 - p_2}{\sqrt{\dfrac{\sum d_1^2 + \sum d_2^2}{N_1 + N_2 - 2} \cdot \dfrac{N_1 + N_2}{N_1 N_2}}}$$

and Σd² = Σ(x − x̄)² with death = 1, survival = 0.
A list of the probabilities associated with these 4 methods on several
sample calculations is given below.


                      Probability Value Associated with
Mortalities         C.R.    Uncorr.       Corrected     t-test
                            Chi-square    Chi-square

1/30 vs. 5/30       0.08    0.04          0.20          0.23
5/30 vs. 9/30       0.22    0.22          0.37          0.42
10/30 vs. 15/30     0.18    0.20          0.30          0.37
1/20 vs. 5/20       0.06    0.08          0.18          0.08
5/20 vs. 9/20       0.18    0.18          0.31          0.22
1/10 vs. 2/10       0.53    0.52          -             0.55
2/10 vs. 5/10       0.14    0.17          0.35          0.17


We realize that a small difference between probability values is of
little significance. The first two methods, C.R. and uncorrected
chi-square, seem to yield comparable results, with the results of the t-test
sometimes similar and often quite different. Furthermore, the corrected
chi-square test always yields results closer to a p of 1.0 than the first
two methods. Our question is: is one of the 4 methods preferable to
the others for this type of data when N is small?


ANSWER: Yates has shown (Journal of the Royal Statistical Society
Supplement, I, 217-235, 1934) that, for samples with a
single degree of freedom drawn from a binomial population,
a simple correction for continuity gives a satisfactory approximation
to the continuous value of chi-square. Since your samples are
small, the use of corrected chi-square is, on the average, a better
approximation to continuous chi-square than the other methods described.

As for the normal approximation (your first method), Student found
(Biometrika, VI, 1-25, 1908) that this consistently underestimates the
probability in the regions ordinarily used for rejection. Your first
column would make this evident were it not for mistakes in computation,
coupled with the fact that your probabilities are large.

The quantity which you have called t does not follow the
Student-Fisher distribution.

Since the tabulated t with infinite degrees of freedom (the normal
distribution) is the same as that of chi with one degree of freedom, the
identical probability can be read from the table of chi-square and from
the (two-tailed) table of the normal distribution if t is calculated as
follows:

$$t = \frac{p_1 - p_2}{\sqrt{2pq/n}}$$

Here, p = (p_1 + p_2)/2 and n is the size of each sample.
It happens that testing the null hypothesis is not particularly
interesting in any of your examples because P is greater than 0.05 in all
of them. I took mortalities 1/20 vs. 6/20 and calculated these
probabilities:


(1) Normal Approximation, P = 0.028 
(2) Chi-square (or t correctly calculated), P = 0.038
(3) Corrected Chi-square, P = 0.096 


For interest, I also calculated the exact test (R. A. Fisher, Statistical 
Methods for Research Workers, section 21.02) and got P = 0.046, which 
happens in this sample to be nearest the probability turned up by un- 
corrected chi-square. In some of your samples the exact test is as easily 
calculated as chi-square. In others, having a small expectation in one 
cell of the contingency table, a device given by Yates yields close approxi- 
mations. This device is described in connection with Table VIII in 
Statistical Tables by Fisher and Yates. 
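
The four probabilities quoted in the answer for mortalities 1/20 vs. 6/20 can be retraced with the standard library. The sketch below is illustrative, not the answerer's own working: the function names are hypothetical, equal group sizes are assumed, and the exact test is computed as a one-tailed hypergeometric probability, which is an assumption chosen because it agrees with the quoted value.

```python
from math import comb, erfc, sqrt

def two_tailed_normal(z):
    """Two-tailed probability of a normal deviate as large as |z|."""
    return erfc(abs(z) / sqrt(2.0))

def compare_mortalities(d1, d2, n):
    """Probabilities for deaths d1 and d2 in two groups of n animals each."""
    p1, p2 = d1 / n, d2 / n
    # (1) Normal approximation with separate variances (the critical ratio).
    z_unpooled = (p1 - p2) / sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
    # (2) Uncorrected chi-square, i.e. t = (p1 - p2) / sqrt(2pq/n).
    p = (p1 + p2) / 2.0
    z_pooled = (p1 - p2) / sqrt(2.0 * p * (1 - p) / n)
    # (3) Chi-square corrected for continuity on the 2 x 2 table.
    a, b, c, d = d1, n - d1, d2, n - d2
    total = 2 * n
    chi2_corrected = (total * (abs(a * d - b * c) - total / 2.0) ** 2
                      / ((a + b) * (c + d) * (a + c) * (b + d)))
    # (4) Exact test: probability of a split at least as extreme (one tail).
    deaths = d1 + d2
    exact = sum(comb(n, i) * comb(n, deaths - i)
                for i in range(0, min(d1, d2) + 1)) / comb(total, deaths)
    return {
        "normal": two_tailed_normal(z_unpooled),
        "chi_square": two_tailed_normal(z_pooled),
        "corrected_chi_square": erfc(sqrt(chi2_corrected / 2.0)),
        "exact_one_tailed": exact,
    }

for name, value in compare_mortalities(1, 6, 20).items():
    print(name, round(value, 3))
```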




THE BIOMETRIC SOCIETY 


Regional Officers. The following results of regional annual elections 
for 1950 should be recorded: Australasian Region—Vice President, Dr. E. 
A. Cornish; Secretary-Treasurer, John Keats. British Region—Vice 
President, Prof. R. A. Fisher; Secretary, Dr. D. J. Finney; Treasurer, Dr. 
A. R. G. Owen. Eastern North American Region—Vice President, Dr. 
Joseph Berkson; Secretary-Treasurer, Prof. Walter T. Federer; Com- 
mittee term 1950-1952, Dr. W. J. Youden, Lila F. Knudson. Indian 
Region—Vice President, Dr. P. C. Mahalanobis; Secretary, Dr. C. 
Radhakrishna Rao; Treasurer, Anakul Chandra Das. 

Proceedings of a Biometrical Clinic on Entomological Problems. One 
of the objectives of the Biometric Society is the “dissemination of 
effective mathematical and statistical techniques”. A popular means of 
implementing this objective takes the form of a jointly sponsored meet- 
ing with a biological society. In several cases the program has con- 
sisted of questions answered informally by a panel of biometricians, the 
questions being submitted by the biologists. The proceedings of such a 
joint session of the American Association of Economic Entomologists 
and the Eastern North American Region of the Biometric Society held
in December 1948 were recorded electronically. After the proceedings 
were transcribed, edited and reviewed by a number of entomologists 
and professional biometricians, the Council agreed to their publication 
as an experiment, on a self-supporting basis. Announcements were 
included in a general mailing to the members of the two sponsoring or- 
ganizations and resulted in 98 prepublication orders from the Biometric 
Society and 489 from the Economic Entomologists. 

In consequence 700 copies were ordered and the bound proceedings 
of 64 pages were published in February 1950. At the price of 50 cents 
per copy for members of the two organizations and 75 cents to others 
it was hoped that the project would pay for itself. The production costs 
($275.00) were estimated correctly but we underestimated the time 
required in the Secretary’s office for handling the project and did not 
allow for unpaid orders. About 75 copies now remain and it is hoped 
that their sale will further reduce the deficit. 

However, as one of the direct consequences of the undertaking, the 
Society has enrolled 15 new members from among the Entomologists,
more than double the number previously listed in that biological field.




EXPERIMENTAL DESIGNS 
(1) A SURVEY OF TYPES OF EXPERIMENTAL DESIGNS 
GERTRUDE M. Cox 
Institute of Statistics, University of North Carolina 
Abstract 


PLANNING a piece of research in any field involves a certain order of
procedure. This will in general have three parts: (1) a statement
of the objectives, (2) a description of the experiment covering such 
points as the selection of experimental treatments, decision regarding 
accuracy of measurements, selecting the experimental units, deter- 
mining the general condition under which the test shall be made and 
specifying the experimental design and (3) an outline of the method of 
the analysis of the results. 

Methods for increasing the accuracy of experiments may be classified 
into three types: (1) increasing the size of the experiment, (2) refining 
the technique and (3) handling the experimental material so that the 
effects of variability are reduced. This may be done by careful selection 
of the material, by taking additional measurements that provide in- 
formation about the material or by skillful grouping of the experimental 
units into an efficient plan. 

Using the ideas of confounding and grouping of the experimental 
units as the criteria, the various types of experimental designs have 
been classified. The following types of designs were presented, illus- 
trated and discussed: (1) Complete block designs including completely 
randomized, randomized blocks, latin square and cross-over designs. 


TABLE 1. ANALYSES OF FIELD EXPERIMENTS, 1942-1948,
NORTH CAROLINA AGRICULTURAL EXPERIMENT STATION

Number of Analyses

Crop columns, in order: Soybeans, Corn, Pasture, Peanuts, Potatoes,
Cotton, Tobacco, Small Grain, Others | Total No. | Percent

Completely randomized  | 22 25 12                            |   59 |   0.9
Randomized blocks      | 210 419 665 502 260 362 637 298 558 | 3911 |  61.9
Latin Square           | 23 11 48 9                          |   91 |   1.4
Split-Plot             | 122 142 388 50 36 108 70 212 17     | 1245 |  19.7
Simple lattice         | 48 42 54 7 39 43 16                 |  249 |   4.0
Triple lattice         | 39 179 5 6 49 5 21 2                |  306 |   4.9
Quadruple lattice      | 59                                  |   59 |   0.9
Balanced lattice       | 6 45 79 16 3 19 39 19 3             |  229 |   3.6
Rectangular lattice    | 23                                  |   23 |   0.4
Lattice square         | 122 7 2 14                          |  145 |   2.3
Total                  | 425 972 1132 667 360 563 874 607 717 | 6317 | 100.0
(2) Incomplete block designs as balanced incomplete block, balanced
lattice, incomplete latin square and lattice square designs. Some 
partially balanced designs were presented with special emphasis being 
given to the split-plot designs. 

It is impossible to give many general rules which will be helpful in 
selecting designs. Each experimental situation presents its limitations 
to be considered. A good working rule is to use the simplest design 
that meets the needs of the experiment. This is not to say that the 
more complex designs will be used only rarely. Table 1 gives a sum- 
mary of the extent to which the different types of designs were used 
during 1942-1948 at the North Carolina Agricultural Experiment
Station. It is noted that almost 62 per cent of the 6,317 analyses 
were from field experiments arranged in randomized blocks. 

Table 2 gives a summary of the relative efficiencies of incomplete 
block as compared with randomized complete block designs for 1011 
analyses of soybean, corn, pasture, peanuts, potatoes, cotton, tobacco 
and small grain experiments conducted from 1942 to 1948 at the North 
Carolina Experiment Stations. There is an over all pooled gain in 
efficiency of 23%. This means that four replications of an incomplete 
block design, on an average, are about as accurate as five replications 
of a randomized block design. This means more efficient use of ex- 
perimental land and labor. 


TABLE 2. RELATIVE EFFICIENCIES OF LATTICE DESIGNS AS COMPARED TO
RANDOMIZED COMPLETE BLOCK DESIGNS.

k = 3 4 5 6 7 8 9 10 11 | Total No.

Simple Lattice       No. | 68 92 42 23 15 5 4               | 249
                     %   | 110 115 109 115 119 129 127

Triple Lattice       No. | 15 38 60 131 32 14 13 2          | 306
                     %   | 110 119 129 127 127 128 139 104

Quadruple Lattice    No. | 7 52                             |  59
                     %   | 131

Balanced Lattice     No. | 87 103 38 1                      | 229
                     %   | 111 128 114 131

Lattice Square       No. | 10 69 54 11 1                    | 145
                     %   | 145 142 139 139 154

t = 7 × 8
EXPERIMENTAL DESIGNS 
(2) MULTIVARIATE EXPERIMENTATION 


M. H. QUENOUILLE 


Marischal College, Aberdeen, Scotland 


INTRODUCTION 


THE USE OF several variables in experimentation is by no means 
uncommon. Thus, in field trials the weight of grain and straw 
may both be measured; in growth experiments, different measurements 
of animals may be taken over a series of weeks; or in infection trials, 
the factors believed to influence resistance may be observed. In each 
of these examples, several concomitant variables are measured and used 
in the subsequent analysis of the experiment, each contributing in some 
part to the final estimates and conclusions. 

The employment of several variables in this manner presents problems 
of interpretation, design, analysis and significance, many of which 
are still unsolved. It is the purpose of the following note to review and 
extend the existing approaches to, and methods of, such multivariate 
experimentation. 


DEPENDENT AND INDEPENDENT VARIABLES 


The first step in the design or analysis of multivariate experiments 
is to decide which variables are to be regarded as dependent and which 
as independent. The independent variables may then be employed 
to reduce the unaccountable variation in the dependent variables using 
the analysis of covariance technique. The analysis of variance of the 
dependent variables, thus corrected, may then be used jointly to derive 
the linear function of them which is most sensitive to treatment 
differences. This is done by discriminant analysis. For example, the 
yield of straw may be employed to reduce the variability in grain 
yield if the straw yield reflects fertility differences but not treatment 
differences; or alternatively, if straw yield reflects treatment differences, 
a combination of straw yield with grain yield might be used to test 
treatment differences. 

The decision whether to employ any particular variable as dependent 
or independent or, in fact, at all, rests partly upon the extent 
to which each variable reflects treatment differences and partly upon 
the questions that the analysis is intended to answer. Obviously, if 
we are interested in testing only the effect of treatment on individual 
observations, such as grain weight, then we use only one dependent 
variable at a time; but, if we desire to determine the manner in which 
the treatments act upon the observations, we might have several de- 
pendent variables. For example, in testing the difference between two 
diets, we may take several measurements, such as weight, height, and 
leg-length, and combine them to find the most sensitive index of dietary 
differences of the type tested. 

The choice of independent variables will naturally depend upon 
what is to be tested, but, if the main aim is to reduce the unaccountable 
variability and to detect treatment differences, the choice of independent 
variables will be restricted.* The covariance method will be 
applicable only if the dependent variable is more sensitive to treatment 
differences than the independent variable, and the residuals of both 
variables are correlated. Even when this is so, the method will be 
inappropriate if the correlation between the residuals is less than that 
between the residuals prior to the elimination of the treatment effects. 
For example, the following analysis was calculated for the observed 
weight increases, y, and food consumptions, x, of chickens receiving 
four different diets: 


                          Degrees of   Sum of squares   Sum of products   Sum of squares 
                          freedom      of y             of x and y        of x 
Treatments                     3          12650             15059             20589 
Residuals                     32          20510             22273             77349 
Treatments + residuals        35          33160             37332             97938 
Treatments: 
variance ratio, F                          6.58                                2.84 


Correlation between “residuals” = 0.559. 
Correlation between “treatments + residuals” = 0.655. 


*The need for caution here is emphasized by Mr. M. Healy in the discussion that follows. However, 
I disagree with Mr. Healy’s remark that an independent variable must be altogether independent 
of treatment effects. Thus it is a common procedure to adjust the measurements in growth experiments 
for initial weight. If, however, a situation arose in which the only available weights were taken one 
day after the commencement of the experiment, it would still be possible to adjust using these observations. 
The question then is where to draw the line. Naturally, great caution must be exercised in 
choosing such independent variates, and adjustment for weights taken one week after the commencement 
of the experiment may be very misleading. 


In this example, since the weight increases appear to be more sensitive 
than food consumptions to differences in the diets, and the residuals 
are highly correlated, it might be expected that the elimination of the 
effects of food consumption from weight increases would give a more 
sensitive index of dietary differences. However, since this correlation 
is less than that observed before the elimination of treatment effects, 
this expectation is not realised and the covariance analysis gives us: 


                          Degrees of   Sum of squares   Mean      Variance 
                          freedom      of y             square    ratio, F 
Treatments                     3           4834          1611       3.54 
Residuals                     31          14096           454.7 
Treatments + residuals        34          18930 


Evidently the variable x should not be treated as an independent variable 
in this analysis. If, however, the sum of products for treatments had 
been 5059 instead of 15059, the correlation would have been reduced 
to 0.480 and the variance ratio would have become 8.38; x could then 
have been treated as an independent variable. 
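
The figures quoted above can be checked with a short calculation. The sketch below assumes the usual single-covariate adjustment, in which each sum of squares of y is reduced by (sum of products)² / (sum of squares of x) and one residual degree of freedom is given up to the regression; the function and variable names are illustrative only.

```python
def covariance_adjusted_f(treat, resid, df_treat, df_resid):
    """`treat` and `resid` are (Syy, Sxy, Sxx) triples.

    Returns (unadjusted F for y, covariance-adjusted F for y,
    residual correlation, treatments + residuals correlation).
    """
    total = tuple(t + r for t, r in zip(treat, resid))

    def adjusted(s):
        syy, sxy, sxx = s
        return syy - sxy ** 2 / sxx

    def correlation(s):
        syy, sxy, sxx = s
        return sxy / (syy * sxx) ** 0.5

    f_raw = (treat[0] / df_treat) / (resid[0] / df_resid)
    adj_resid = adjusted(resid)
    adj_treat = adjusted(total) - adj_resid
    # The regression on x uses up one residual degree of freedom.
    f_adj = (adj_treat / df_treat) / (adj_resid / (df_resid - 1))
    return f_raw, f_adj, correlation(resid), correlation(total)


residuals = (20510, 22273, 77349)
observed = covariance_adjusted_f((12650, 15059, 20589), residuals, 3, 32)
# Hypothetical case from the text: treatment sum of products 5059.
hypothetical = covariance_adjusted_f((12650, 5059, 20589), residuals, 3, 32)
print([round(v, 3) for v in observed])
print([round(v, 3) for v in hypothetical])
```

With the observed sums the adjustment lowers the variance ratio, whereas with the smaller treatment sum of products it raises it, in line with the correlation criterion stated above.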

Thus it is seen that, when it has been decided precisely what has 
to be tested, it is still necessary to decide whether particular variables 
should be treated as dependent or independent. This may be done by 
tests similar to that indicated above. 


THE FORM OF ANALYSIS 


When it has been decided which variables are to be regarded as 
dependent and which as independent then the analysis may be carried 
out in two stages. First, the sums of squares and products for the 
“residual” and “residual + treatments” should be used in a covariance 
analysis to eliminate the effects of the independent variables. The 
degrees of freedom will correspondingly be reduced by the number of 
variables eliminated. Secondly, the sums of squares and products, 
thus corrected, may be used in a discriminant analysis as described 
by Fisher (1948). The coefficients of the “most-sensitive” linear function 
may be evaluated exactly using the method of divided differences 
or, by trial and error, minimising the ratio of the residual sum of squares 
to the residual + treatment sum of squares. Alternatively, we may 
calculate the coefficients by successive approximation. This approach 
is useful since it allows the effect of extra variables to be examined and 
a possible method will now be described. 

Suppose the residual or “within” sums of squares and products are 
given by $W_{ij}$ and the residual + treatment or “total” sums of squares 
and products are given by $T_{ij}$. Let $W_{ij}/T_{ij} = R_{ij}$ and suppose 
$R_{11} < R_{22} < R_{33}$, etc.; then $x_1$ is the most sensitive individual 
variable, but in general 

$$x_1' = x_1 + a_i x_i, \quad \text{where } a_i = 2(R_{11} - R_{1i}) \times \operatorname{sign} T_{1i},$$

will be more sensitive. Thus, $x_1$ can be replaced by $x_1'$ and a second 
approximation calculated. Two points should be noted: 


(a) The convergence will be more rapid if the units of $x_1, x_2, \ldots$ are 
chosen to make the sums of squares of $x_i$ and sums of products 
comparable in magnitude; 

(b) The condition of the last section for the covariance method to 
lead to a more sensitive test of treatments is equivalent to 

$$R_{ij}^2 - R_{ii}R_{jj} > 0.$$


The following examples will serve to demonstrate this method. 


Examples 


1) The analysis of the last section may be extended as follows: 


              $Sy^2$       $Sxy$        $Sx^2$ 
$W_{ij}$      20510        22273        77349 
$T_{ij}$      33160        37332        97938 
$R_{ij}$      0.618516     0.596620     0.789775 


$a_2 = 2(0.618516 - 0.596620) = 0.044$ 
$y' = y + 0.044x$ 


              $Sy'^2$      $Sxy'$ 
$W_{ij}$      22619.77     25676.36 
$T_{ij}$      36634.82     41641.27 
$R_{ij}$      0.61744      0.61661 

$a_2 = 2(0.61744 - 0.61661) = 0.0017$ 
$y'' = y + 0.0457x$ 

              $Sy''^2$     $Sxy''$ 
$W_{ij}$      22707.29     25807.85 
$T_{ij}$      36776        41807.77 
$R_{ij}$      0.61744      0.61730 


$a_2 = 2(0.61744 - 0.61730) = 0.0003$ 


$y''' = y + 0.0460x$ 

The changes in $R_{11}$ are very small and evidently further approximation 
would not change the discriminant function appreciably. 
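
The successive approximation used in example 1 can be sketched for the two-variable case. The code below is a minimal illustration rather than the author's own computing scheme: it applies the correction a = 2(R11 - R12) × sign T12 repeatedly without the intermediate rounding used in the text, so its later steps differ slightly from the printed ones. All names are hypothetical.

```python
def compound_sums(s, c):
    """Sum of squares of y' = y + c*x and sum of products of y' with x,
    given s = (Syy, Sxy, Sxx)."""
    syy, sxy, sxx = s
    return syy + 2 * c * sxy + c * c * sxx, sxy + c * sxx


def successive_approximation(w, t, steps):
    """Return the coefficient c of x and the ratio R11 for y + c*x."""
    c = 0.0
    for _ in range(steps):
        w_yy, w_yx = compound_sums(w, c)
        t_yy, t_yx = compound_sums(t, c)
        sign = 1.0 if t_yx >= 0 else -1.0
        # Correction 2(R11 - R12) x sign(T12) for the current compound.
        c += 2 * (w_yy / t_yy - w_yx / t_yx) * sign
    w_yy, _ = compound_sums(w, c)
    t_yy, _ = compound_sums(t, c)
    return c, w_yy / t_yy


W = (20510, 22273, 77349)   # within (residual) sums
T = (33160, 37332, 97938)   # total (treatments + residuals) sums
first_c, first_r = successive_approximation(W, T, 1)
final_c, final_r = successive_approximation(W, T, 40)
print(round(first_c, 4), round(final_c, 4), round(final_r, 5))
```

The first correction agrees with the 0.044 of the text, and further iteration settles near the coefficient 0.046 with a ratio close to 0.61744.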

2) As a second example, weight increases over a longer period were 
analysed for this experiment. There were three variables to be considered: 
$y$, the weight increases; $x_1$, the food consumed during the 
last two weeks; $x_2$, the food consumed during the first three weeks. 
The sums of squares and products of these variables were 


              $Sy^2$       $Sx_1y$      $Sx_2y$ 
$W_{ij}$      69224        78148        9953 
$T_{ij}$      107292       122119       32447 
$R_{ij}$      0.64519      0.63993      0.30675 

                           $Sx_1^2$     $Sx_1x_2$ 
$W_{ij}$                   189031       58545 
$T_{ij}$                   253270       93670 
$R_{ij}$                   0.74636      0.62501 

                                        $Sx_2^2$ 
$W_{ij}$                                89626 
$T_{ij}$                                117605 
$R_{ij}$                                0.76209 


Since $R_{ii}R_{jj} - R_{ij}^2 > 0$ for all $i$ and $j$, the three variables could be 
considered as dependent. The analysis then proceeds as previously: 


$a_1 = 0.01$, $a_2 = 0.68$ 
$y' = y + 0.7x_2$ 

              $Sy'^2$      $Sx_1y'$     $Sx_2y'$ 
$W_{ij}$      127075       119130       72691 
$T_{ij}$      210344       187688       114770 
$R_{ij}$      0.60413      0.63472      0.63336 

$a_1 = -0.06$, $a_2 = -0.06$ 


$y'' = y - 0.06x_1 + 0.64x_2$ 

              $Sy''^2$     $Sx_1y''$    $Sx_2y''$ 
$W_{ij}$      105481       104275       63801 
$T_{ij}$      176059       166872       102094 
$R_{ij}$      0.59912      0.62488      0.62492 

$a_1 = -0.05$, $a_2 = -0.05$ 


$y''' = y - 0.11x_1 + 0.59x_2$ 


              $Sy'''^2$ 
$W_{ij}$      89663 
$T_{ij}$      150558 
$R_{ij}$      0.59554 


The calculation may be stopped when the desired degree of accuracy is 
obtained or, at any stage, convergence may be accelerated by examining 
the changes in the coefficients. The exact discriminator, in this example, 
should be $y - 0.19x_1 + 0.49x_2$ and for this linear function, 
$R_{11} = 0.59245$. 
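
The ratio for any trial compound in example 2 can be checked directly from the within and total tables, since the sum of squares of a linear compound is a quadratic form in its coefficients. The sketch below assumes the variables are ordered (y, x1, x2); the helper names are illustrative and not taken from the paper.

```python
# Symmetric within (W) and total (T) matrices for (y, x1, x2), example 2.
W = [[69224, 78148, 9953],
     [78148, 189031, 58545],
     [9953, 58545, 89626]]
T = [[107292, 122119, 32447],
     [122119, 253270, 93670],
     [32447, 93670, 117605]]


def quadratic_form(matrix, coeffs):
    """Sum of squares of the compound sum(coeffs[i] * variable[i])."""
    n = len(coeffs)
    return sum(coeffs[i] * matrix[i][j] * coeffs[j]
               for i in range(n) for j in range(n))


def ratio(coeffs):
    """R = W / T for the linear compound with the given coefficients."""
    return quadratic_form(W, coeffs) / quadratic_form(T, coeffs)


def first_corrections():
    """a_i = 2(R11 - R1i) x sign(T1i) for i = 2, 3, starting from y alone."""
    r11 = W[0][0] / T[0][0]
    return [2 * (r11 - W[0][i] / T[0][i]) * (1 if T[0][i] >= 0 else -1)
            for i in (1, 2)]


print([round(a, 2) for a in first_corrections()])
for coeffs in ([1, 0, 0], [1, -0.11, 0.59], [1, -0.19, 0.49]):
    print(coeffs, round(ratio(coeffs), 5))
```

The ratios for y alone, for the third approximation and for the exact discriminator agree with the values 0.64519, 0.59554 and 0.59245 given in the example.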


TESTS OF SIGNIFICANCE 


It has already been indicated above that the degrees of freedom of 
$W_{ij}$ and $T_{ij}$ should be reduced by the number of independent variables 
eliminated. Where a discriminant analysis is carried out the effect is 
more difficult to estimate, but may be approximately gauged by decreasing 
the degrees of freedom of $W_{ij}$, but not $T_{ij}$, by one less than 
the number of dependent variables. Thus, if $T_{ij}$ and $W_{ij}$ have $n$ and 
$n - p$ degrees of freedom respectively and $q$ dependent variables are 
used, the revised degrees of freedom for $R_{ij}$ will be $n - p - q + 1$.* 
Alternatively, Bartlett’s (1938) test may be used and 
$-\{n - \tfrac{1}{2}(p + q + 1)\} \log_e R_{11}$ may be tested as $\chi^2$ with 
$p + q - 1$ degrees of freedom. This test will usually be less accurate 
than the above test. 


*This will be an exact test for either $p$ or $q$ equal to one. 
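
The Bartlett approximation may be evaluated as in the following sketch. It reads the multiplier of (p + q + 1) as one half, which is the value that reproduces the figures quoted for the examples in this section; the function name is hypothetical.

```python
import math


def bartlett_chi_square(n, p, q, r):
    """Return (-{n - (p + q + 1)/2} * ln r, degrees of freedom p + q - 1).

    n: degrees of freedom of the total, p: treatment degrees of freedom,
    q: number of dependent variables, r: ratio R11 of within to total.
    """
    statistic = -(n - 0.5 * (p + q + 1)) * math.log(r)
    return statistic, p + q - 1


with_x = bartlett_chi_square(35, 3, 2, 0.61744)      # example 1, y and x
without_x = bartlett_chi_square(35, 3, 1, 0.61852)   # example 1, y alone
print(round(with_x[0], 2), with_x[1])
print(round(without_x[0], 2), without_x[1])
```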
If we carry out these tests for the above examples, we get the following 
analysis for example 1: 


                        Degrees of                 Mean       Variance 
                        freedom       $Sy^2$       square     ratio 
$T_{11} - W_{11}$            4         14070       3517.5      4.80 
$W_{11}$                    31         22707        732.5 
$T_{11}$                    35         36777 


$-\{35 - \tfrac{1}{2}(3 + 2 + 1)\} \log_e 0.61744 = 15.43$ is a $\chi^2$ with 4 degrees of 
freedom. 

Both of these tests indicate that P lies between 0.01 and 0.001, 
but it is apparent that there is no increase in accuracy from the inclusion 
of $x$. This can be tested using Bartlett’s test, since $\chi^2$ with 3 
degrees of freedom prior to the inclusion of $x$ is approximately 
$-\{35 - \tfrac{1}{2}(3 + 1 + 1)\} \log_e 0.61852 = 15.61$. The difference, which 
is negative owing to the approximate nature of the test, indicates that 
no improvement has resulted from the inclusion of $x$. The improvement 
in accuracy might also be tested using the original analysis of 
variance. The appropriate residual sum of squares for this is 33160 × 
0.61744 = 20474, so that the analysis is as follows: 


                                   Degrees of   Sum of     Mean 
                                   freedom      squares    square 
Original treatments                     3        12650      4217 
Improvement due to including x          1           36        36 
Treatments                              4        12686 
Residuals                              31        20474       660 
Total                                  35        33160 


The improvement is, as before, insignificant. For example 2, the 
corresponding analysis is: 

                                   Degrees of   Sum of     Mean      Variance 
                                   freedom      squares    square    ratio 
Original treatments                     3        38068     12689 
Improvement due to including 
$x_1$ and $x_2$                         2         5327      2664       1.25 
Treatments                              5        43395      8679       4.07 
Residuals                              30        63897      2130 
Total                                  35       107292 

$\chi^2_{(5)} = -\{35 - \tfrac{1}{2}(3 + 3 + 1)\} \log_e 0.59554 = 16.33$ 
$\chi^2_{(3)} = -\{35 - \tfrac{1}{2}(3 + 1 + 1)\} \log_e 0.64519 = 14.24$ 
Improvement due to including $x_1$ and $x_2$, $\chi^2_{(2)} = 2.09$. 


The two forms of analysis are again in substantial agreement; the 
value of P being just below 0.01 and the improvement due to including 
$x_1$ and $x_2$ being barely above expectation. 
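
The analysis of variance form of the test can be reproduced from three quantities: the total and within sums of squares of y and the final ratio R11. The sketch below assumes, as in the tables above, that the residual sum of squares is T11 × R11 and that each added variable costs one residual degree of freedom; the names are illustrative.

```python
def improvement_analysis(t11, w11, r_final, df_total, df_treat, extra_vars):
    """Return (F for the improvement, F for all treatments).

    t11, w11: total and within sums of squares of y alone;
    r_final: R11 of the discriminant; extra_vars: variables added to y.
    """
    residual_ss = t11 * r_final
    original_ss = t11 - w11
    improvement_ss = w11 - residual_ss
    df_resid = df_total - df_treat - extra_vars
    residual_ms = residual_ss / df_resid
    f_improvement = (improvement_ss / extra_vars) / residual_ms
    f_treatments = ((original_ss + improvement_ss)
                    / (df_treat + extra_vars)) / residual_ms
    return f_improvement, f_treatments


# Example 2: y with x1 and x2 added, R11 = 0.59554.
f_improvement, f_treatments = improvement_analysis(
    107292, 69224, 0.59554, 35, 3, 2)
print(round(f_improvement, 2), round(f_treatments, 2))
```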

The form of analysis indicated in the previous section is in conse- 
quence very useful for rapid investigations of the variables relevant 
to any enquiry. For this purpose the degree of approximation that is 
required may be easily gauged. Thus, in the final approximation of 
example 2, $R_{11}$ is obviously subject to an error of not more than 0.01 
and the treatments sum of squares is therefore at most 1073 too small 
(actually 332). This difference is not of any importance, but, if it 
had been, a closer approximation could have been taken. 


EXPERIMENTS WITH FIXED TREATMENTS 


When the treatments are fixed in their application throughout all 
phases of the experiment, as in a growth experiment, the results may 
be analysed jointly or separately. Thus, for example, the total growth 
may be analysed or alternatively, the effect of the treatments on the 
growth curve may be investigated, depending on which aspect is regarded 
as important. Bartlett (1947) has given an example of the 
multivariate analysis of estimated parameters of a growth curve and, 
similarly, an analysis may be carried out on the direct observations as 
in the following example. 


Example. 


Forty pairs of littermates (mice) were used in an experiment to 
investigate the effect of a dietary supplement. The weight increases 
were observed two, four and six weeks after weaning and the differences 
between pairs, $x_1$, $x_2$, $x_3$, were calculated for each period. The sums, 
sums of squares and sums of products were as follows: 


  i \ j        1          2          3        $\sum x_i$ 
    1        501.20     174.80      84.52       54.8 
    2                   243.40     287.08       26.2 
    3                              742.68        5.0 


The mean difference was largest after two weeks although it was not 
obvious that the subsequent differences could be completely ascribed 
to this large initial effect. A discriminant analysis might therefore be 
carried out to test whether the second and third observations give 
further evidence of treatment effects not contained in the first observation. 
This may be most easily executed by calculating the regression 
of a dummy variate on $x_1$, $x_2$, $x_3$ and, consequently, the discriminant 
coefficients from the equations 


$501.20a_1 + 174.80a_2 + 84.52a_3 = 54.8$ 
$174.80a_1 + 243.40a_2 + 287.08a_3 = 26.2$ 
$84.52a_1 + 287.08a_2 + 742.68a_3 = 5.0$ 

Using successive eliminations, we get 

$182.436a_2 + 257.603a_3 = 7.088$ 
$257.603a_2 + 728.427a_3 = -4.242$ 
$364.688a_3 = -14.249$ 
$a_3 = -0.03907$ 
$a_2 = 0.09402$ 
$a_1 = 0.08314$ 


The significance of the further observations may then be tested as 
follows: 

                                     Degrees of   Sum of                                Mean 
                                     freedom      squares                               square 
Treatments (period 1)                     1       $(54.8)^2 / 501.20 = 5.992$ 
Treatments (period 2, period 1 
effects eliminated)                       1       $(7.088)^2 / 182.436 = 0.275$ 
Treatments (period 3, period 1 
and 2 effects eliminated)                 1       $(-14.249)^2 / 364.688 = 0.557$ 
Residual                                 37       33.176                                0.897 
Total                                    40       40.000 


There is apparently little to be gained by using the additional observa- 
tions from the second and third periods. 
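
The successive eliminations of this example amount to ordinary Gaussian elimination, with each pivot step yielding the sum of squares for one period after the earlier periods have been eliminated. The following sketch assumes, as the table does, a dummy variate equal to one for each of the forty pairs, so that the total sum of squares is 40; the function names are hypothetical.

```python
def forward_eliminate(matrix, rhs):
    """Reduce the symmetric system to upper-triangular form."""
    a = [row[:] for row in matrix]
    b = rhs[:]
    n = len(b)
    for k in range(n):
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    return a, b


def back_substitute(a, b):
    """Solve the upper-triangular system produced by forward_eliminate."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - tail) / a[i][i]
    return x


S = [[501.20, 174.80, 84.52],
     [174.80, 243.40, 287.08],
     [84.52, 287.08, 742.68]]
sums = [54.8, 26.2, 5.0]

upper, reduced = forward_eliminate(S, sums)
coefficients = back_substitute(upper, reduced)
# Sequential sums of squares: (reduced right-hand side)^2 / pivot.
sequential_ss = [reduced[k] ** 2 / upper[k][k] for k in range(3)]
residual_ss = 40 - sum(sequential_ss)
print([round(c, 5) for c in coefficients])
print([round(s, 3) for s in sequential_ss], round(residual_ss, 3))
```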


REVERSAL EXPERIMENTS AND EXPERIMENTS WITH ROTATING FACTORS 


The reversal experiment is usually analysed by comparing quadratic 
or cubic components of the two treatment groups. For example, 
if $A_1B_2A_3B_4$ and $B_1A_2B_3A_4$ are the two treatment groups, 
$-A_1 + 3B_2 - 3A_3 + B_4$ and $-B_1 + 3A_2 - 3B_3 + A_4$ are compared. By 
this means it is hoped to eliminate trend and to deal only with observations 
which are sensitive to treatment differences. However, whether 
such linear functions are the most sensitive will depend upon the form 
of trend and the stability of the variance from one phase to the next. 
For many purposes, it will not matter whether the function used is the 
most sensitive, but it is possible to carry out a more extensive analysis 
using discriminant functions. This is most conveniently carried out 
by the regression method as above, but in the following example the 
method of successive approximation has been used, since this may 
generally be applied to experiments with rotating factors. 
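
The way in which the linear function with coefficients -1, 3, -3, 1 removes trend may be illustrated numerically. The sketch below assumes an additive model in which each observation is a smooth trend plus a treatment effect; the quadratic trend and the effect sizes are invented purely for illustration.

```python
CONTRAST = (-1, 3, -3, 1)


def contrast_value(observations):
    """Apply the cubic-type contrast to four successive observations."""
    return sum(c * y for c, y in zip(CONTRAST, observations))


def observe(order, effects, trend):
    """Observations in phases 1..4 under the additive model."""
    return [trend(phase) + effects[treatment]
            for phase, treatment in enumerate(order, start=1)]


def quadratic_trend(phase):
    # Hypothetical trend: constant, linear and quadratic parts.
    return 5.0 + 2.0 * phase + 0.5 * phase ** 2


effects = {"A": 0.0, "B": 1.5}   # hypothetical treatment effects
group_1 = contrast_value(observe("ABAB", effects, quadratic_trend))
group_2 = contrast_value(observe("BABA", effects, quadratic_trend))
trend_only = contrast_value([quadratic_trend(p) for p in range(1, 5)])
print(trend_only, group_1, group_2)
```

Under these assumptions the trend contributes nothing, and the two groups give four times the treatment difference with opposite signs, so their comparison estimates eight times the difference B - A. With a trend of higher degree, or with variances that change between phases, this contrast need not be the most sensitive function, which is the point made in the text.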


Example. 


The weight increases, $x_1$, $x_2$, $x_3$, were observed for forty-eight chickens 
on a reversal experiment. These were used to calculate 
$y_1 = x_3 - [\ldots] + x_1$, $y_2 = [\ldots]$, $y_3 = [\ldots]$, which were then used in a discriminant 
analysis as follows: 


              $Sy_1^2$     $Sy_1y_2$    $Sy_1y_3$ 
$W_{ij}$      48205        -18315       2242 
$T_{ij}$      53124        -20857       2726 
$R_{ij}$      0.90741      0.87812      0.82245 

                           $Sy_2^2$     $Sy_2y_3$ 
$W_{ij}$                   9912         -2060 
$T_{ij}$                   11226        -2310 
$R_{ij}$                   0.88295      0.89177 

                                        $Sy_3^2$ 
$W_{ij}$                                3018 
$T_{ij}$                                3066 
$R_{ij}$                                0.98434 


The quantity $y_2$ is obviously more sensitive than $y_1$ and inspection of 
the values of $R_{12}$ and $R_{23}$ indicates that little is to be gained from the 
inclusion of $y_1$ and $y_3$ in the discriminant function. We may however 
test this by calculating the first steps of the analysis. 


$a_1 = -0.01$, $a_3 = 0.02$ 


$y_2' = y_2 - 0.01y_1 + 0.02y_3$ 


              $Sy_2'^2$    $Sy_1y_2'$   $Sy_2'y_3$ 
$W_{ij}$      10201.03     -18752.21    -2022.06 
$T_{ij}$      11556.19     -21333.72    -2275.94 
$R_{ij}$      0.88273      0.87899      0.88845 

$a_1 = -0.01$, $a_3 = 0.01$. 


Since these values do not give rise to greatly changed values of $R_{ij}$, we 
might try 


$y_2'' = y_2 - 0.03y_1 + 0.06y_3$ 

              $Sy_2''^2$   $Sy_1y_2''$  $Sy_2''y_3$ 
$W_{ij}$      10809.88     -19626.63    -1946.18 
$T_{ij}$      12249.26     -22287.16    -2207.82 
$R_{ij}$      0.88249      0.88062      0.88149 


The inclusion of $y_1$ and $y_3$ leaves the value of $R_{22}$ virtually unaltered. 
As previously this could be tested using the analysis of variance: 


                              Degrees of   Sum of     Mean 
                              freedom      squares    square 
Treatments ($y_2$)                 1         1314 
Treatments ($y_1$ and $y_3$)       2            5       2.5 
Total treatments                   3         1319 
Residual                          44         9907       225 
Total                             47        11226 


Of more interest in this case is whether the inclusion of $y_2$ and $y_3$, 
after $y_1$ has been used, increases the discrimination significantly. To 
test this, we get the analysis: 


                              Degrees of   Sum of     Mean 
                              freedom      squares    square 
Treatments ($y_1$)                 1         4919 
Treatments ($y_2$ and $y_3$)       2         1324       662 
Total treatments                   3         6243 
Residual                          44        46881      1065 
Total                             47        53124 


Evidently $y_2$ and $y_3$ do not cause a significant reduction in the residual 
sum of squares, and $y_1$ would be sufficiently sensitive for most purposes. 


The above example also demonstrates a possible approach to the 
general rotation experiment. Thus, if the three cycles A, B, C, A, ...; 
B, C, A, B, ...; C, A, B, C, ... are being used, analyses may be carried 
out testing the treatment differences in each phase and these may then 
be combined to give a joint test of treatment differences. As previously, 
where the means and variances are reasonably stable, we may derive 
little more from this approach than from the usual least-squares analysis. 


RESIDUAL EFFECTS 


The experiments with rotating treatments considered in the last 
section do not allow any residual effects to be estimated. For an experi- 
ment with three treatments, the residual effects of A, B and C cannot 
be distinguished from the main effects of B, C and A. Thus, if the 
residual effects are likely to be of importance an extended design must 
be used in which several cycles are used. For example, for three 
treatments, the cycles A, B, C, and A, C, B might be used in all phases 
so that the first residuals can be differentiated from the main effects. 
If $A_B$, $A_C$, etc., represent the total of treatments A, which follow 
treatments B, C, etc., then the main effect of A may be estimated by 


$$(A_B + A_C - B_C - C_B)/3$$

and the residual effect of A by 

$$(B_A + C_A - B_C - C_B)/3.$$


If, therefore, we carry out analyses at each phase and use these to 
discriminate between $A_B$, $A_C$, $B_A$, $B_C$, $C_A$ and $C_B$, the resulting 
discrimination function can be used in conjunction with the above 
formulae to estimate the main and residual effects. 
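
The two estimators can be checked against an additive model. The sketch below assumes that the value for treatment X following treatment Y is a general mean plus the main effect of X plus the residual effect of Y, with both sets of effects summing to zero; the numerical effects are invented for illustration, and cell means are used in place of totals.

```python
def estimate_effects_of_a(cell):
    """cell[(current, previous)] holds the value for `current`
    following `previous`. Returns (main effect of A, residual effect of A)."""
    baseline = cell["B", "C"] + cell["C", "B"]
    main_a = (cell["A", "B"] + cell["A", "C"] - baseline) / 3
    residual_a = (cell["B", "A"] + cell["C", "A"] - baseline) / 3
    return main_a, residual_a


# Hypothetical effects, each set summing to zero.
mean = 10.0
main = {"A": 2.0, "B": -0.5, "C": -1.5}
residual = {"A": 0.6, "B": -0.2, "C": -0.4}

cells = {(cur, prev): mean + main[cur] + residual[prev]
         for cur in "ABC" for prev in "ABC" if cur != prev}

main_a, residual_a = estimate_effects_of_a(cells)
print(round(main_a, 6), round(residual_a, 6))
```

Under the stated model both estimators recover the assumed effects of A exactly, which confirms that the main and first residual effects are separable once both cycles are present.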

However, the use of such a function assumes that the main and 
residual effects are influenced in a similar manner as the experiment 
proceeds. This is very unlikely and it would seem preferable to calcu- 
late discriminants for the main and residual effects separately. This 
may be done by calculating the components due to main effects (residual 
effects eliminated), and residual effects (main effects eliminated) and 
using these in a discriminant analysis. For this purpose, since the 
component due to residual effects (main effects eliminated) is dummy 
during the first phase, this should be used as an independent variable. 

The analysis suggested above is only one of several possible ap- 
proaches to the analysis of long-term experiments, but, where many 
years are spent collecting experimental data, it does not seem un- 
reasonable that extra time should be spent on its analysis. 

Finally, it must be noted that the above applications of multivariate 
analysis by no means exhaust its possibilities in the analysis of 
long-term experiments. Where several materials or crops are involved 
in a cycle, different treatments or treatment cycles may be compared 
for each crop separately or may be combined in an overall analysis to 
determine and test the manner in which the treatments act. Such 
analyses may appear to be extravagant, but if they throw further light 
upon the treatment behaviour their use will be amply justified. 
DISCUSSION ON EXPERIMENTAL DESIGN 
(1) A SURVEY OF TYPES OF EXPERIMENTAL DESIGNS 


GERTRUDE M. COX 


A. Hald. My comments will treat only a very small part of the ground covered by 
Professor Cox and they will be of a more theoretical nature. 

Experimental results are obscured by the effect of two types of disturbing factors: 
(1) the effect of factors displaying an unknown systematic variation such as the fertility 
of the soil, and (2) the effect of factors varying at random as for instance errors of 
measurements. The total experimental error is composed of variations originating 
from these two sources. 

A fundamental problem in most experiments is, therefore, to eliminate as far as 
possible the effect of systematic factors from the experimental error. Two means are 
available for this purpose: the design of the experiment, i.e., the lay-out of the plots 
and the allocation of the treatments to them, and the statistical technique used in the 
analysis of the results. 

The most commonly used designs such as the randomised block, the latin square, 
etc. are based on a combination of systematic and random arrangements of the treat- 
ments. In this manner part of the fertility variation is eliminated from the experi- 
mental error by means of the systematic structure of the experiment and the analysis 
of variance. The remaining variation in fertility is, however, included in the experi- 
mental error, and due to randomisation it has become a random variation. Here the 
“analysis of variance” is taken in its usual sense as the application of the method of 
least squares under the assumption that the fertility variation may be estimated or 
described by a step-function, which, for example, in the randomised block experiment 
takes on a new value for every block, but is constant within a block. This method is 
often unsatisfactory for a large number of treatments because the plots within a 
block are not even approximately of the same fertility. This fact has led to new de- 
signs containing smaller blocks such as the incomplete block designs. 

Another possibility for solving this difficulty does not seem to have been fully 
explored, namely that of eliminating the fertility variation by using another statistical 
technique, for instance, regression analysis. It would be interesting to know how 
much could be gained in efficiency by describing the fertility variation in a randomised 
block experiment by an adequately chosen function. Naturally this depends partly 
on the fertility variation itself, the gain being largest in cases with large fertility 
variation. The additional computational labour required by using, for instance, a 
polynomial instead of a step-function will be considerable. 

If it can be shown, however, that it is possible to eliminate nearly all the fertility 
variation by a regression analysis it follows that the arrangement of the treatments 
within the blocks is immaterial. We may then give up the randomisation and choose 
an arrangement such that the computations will be relatively easy. In 1929 Neyman 
proposed using a polynomial for describing the fertility variation in experiments, where 
the treatments are arranged in the same order in every block and where the blocks are 
placed in succession. In a paper in 1948 I developed a technique using orthogonal 
polynomials for the same purpose. In about 20 actual field experiments where I 
have applied this method it seemed to work satisfactorily. The fertility variation 
was described by polynomials of up to the fifth degree and the residual variation 
seemed to be random. In the same paper I explored yet another statistical technique 
and tried to describe the fertility trend by a simple moving average. The trend was 
then eliminated by subtraction and a suitably modified analysis of variance of the 
residuals was used to estimate the treatment means and the experimental error. This 
method also seemed to work quite well and required much less arithmetic than the 
regression analysis. What little practical experience I have leads me to hope that 
this last method may prove to be a valuable one. 

The next step must obviously be to compare the results of the various techniques 
suggested in a large number of widely different practical situations. 

H. Astrand. In the twenties we used systematic plot arrangements with moving 
averages such as were mentioned by Professor Hald. Following Professor Fisher’s 
visit the latin square became the standard design, and in about 20,000 field trials 
carried out in the thirties perhaps 80% were latin squares. Randomized blocks are 
now the most common because they are more convenient for mechanical work and 
harvesting. More complex designs are used in breeding work, especially in testing a 
great number of strains. For general research we prefer very simple designs that the 
farmer himself can carry through. Errors of measurement are small compared with 
the soil variation not eliminated by the design. The correlation between residual 
variance and mean yield in 1,200 latin squares carried out in south Sweden was 0.45. 
Interactions are also much larger on poor soils. Insofar as designs must differ on 
soils of different fertility, it is not correct to weight the experiments inversely as their 
variances because this underrepresents the poor soils in the average. We now study 
trends of soil fertility by soil analyses and try to use simple systematic designs on 
such prestudied soils. 

M. P. Schützenberger. In certain types of experiment the observations are of a 
non-parametric kind, in psychology for example. Schemes corresponding to the 
“incomplete block designs” may be used to advantage for the comparison of stimuli, 
for while they allow interactions to be tested, as does the method of paired 
comparisons, they have the further advantage of being easy to use. 

R. A. Fisher. The method of fitting polynomials in one or two dimensions is not 
so new or unexplored as Dr. Hald suggests. For many years Professor van Uven in 
Holland was exploring these methods: and in the period 1912-1925, before I under- 
stood fully the principle of randomisation, I frequently tested, in discussion with my 
friend “Student” (W. S. Gosset), what could be done with field trials by these means. 
Of course, the principal difficulty we encountered, once the labour of such analysis 
had been overcome, was that fitted polynomials in two dimensions might easily 
absorb so many as 20 or 30 degrees of freedom without removing a corresponding 
proportion of the residual sum of squares. 

Other participants in the discussion included G. Rasch and F. Bernstein. 


(2) MULTIVARIATE EXPERIMENTATION 


M. H. QUENOUILLE 


M. Healy. The question of choice of independent variate needs further discussion. 
A useful independent variate must adequately reduce the error variance; but 
it must also be, not only less sensitive, but altogether independent of treatment effects 
if misleading results are not to follow. On this point, the analysis of variance by itself 
is by no means a reliable guide. For example, in the case discussed it appears possible 
that the various diets differ in palatability. Correcting the weight increases for 
differences in food consumption may therefore lead to results different from those in 
which the experimenter is interested. 

The iterative approach to the best discriminant will be of great use in avoiding 
lengthy computation aimed at attaining the top of a flat maximum. In practice, 
however, its importance can be exaggerated. In the experiment quoted, the experimenter 
requires an answer to the questions, “Do these diets give different weight 
increases and if so, by how much?” Even the first is not adequately answered by 
stating that the treatments affect significantly a linear compound of weight increase 
and food consumption while the second part is left unanswered altogether. If a 
linear compound is meaningful, the experimenter himself is often in the best position 
to specify the weights. 

M. S. Bartlett. While we would probably all agree with the cautions expressed 
by Mr. Healy, there are many situations for which discriminant analysis is required 
and it is then important to use the simplest efficient function. There is no point in 
using complicated functions, as has sometimes been done, if these are not superior to 
simpler ones. Here the methods described by Mr. Quenouille for examining the 
effect of introducing further variables should be of value. 

Frank Yates. Summing up, the chairman stated that the papers and discussion 
indicated the importance of the subject and the progress that had been made in the 
last thirty years. He was a little disappointed that there had been no discussion of 
the problems arising in long-term agronomic experiments and animal experiments 
with alternating treatments. He commended these problems to the attention of 
biometricians. 

From his own experience he would like to emphasize the importance of objectivity 
in the methods of analysis. This was one of the most important and fundamental 
contributions to the subject made by Professor Fisher. Once the experimenter—or 
the biometrician—permitted himself the liberty of selection from among a number of 
alternative methods of analysis, the danger of influencing the results in the desired 
direction arose. Simplicity in the analysis was also of the greatest practical 
importance. 

Rothamsted experience had indicated the extreme importance of factorial de- 
signs. Recent developments in this direction, in particular the use of a single replica- 
tion or a half or third replication (fractional replication) with the estimation of error 
from high order interactions, and also split-plot confounding, have proved of the 
greatest value in providing designs for the investigation of an increased number of 
factors in experiments with relatively few experimental units. He would commend 
factorial designs to the attention of experimenters working on animals and human 
beings who had so far failed to take advantage of the possibilities of this type of 
design. 

Whether in agricultural field trials a large number of simple experiments was 
better or worse than a small number of more elaborate experiments appeared to have 
no single answer—it was a balance between advantages and disadvantages. How- 
ever, a great part of the cost of an experiment on an ordinary farm was unaffected by 
the size of the experiment and it was therefore often worthwhile to have experiments 
sufficiently large to include a number of factors. 

Other participants in the discussion included G. Rasch and F. Bernstein. 
BIOMETRICAL ASPECTS OF BIOLOGICAL ASSAY 


(1) BIOLOGICAL ASSAYS WITH SPECIAL REFERENCE TO 
BIOLOGICAL STANDARDS* 


J. O. Irwin 


Medical Research Council’s Statistical Research Unit 
London School of Hygiene and Tropical Medicine 


Abstract 


THE PAPER contains seven sections. Section I gives a historical 
account of the development of biological standards from the time 
of the Budapest International Congress of 1894, when Roux and his 
colleagues reported on the efficacy of diphtheria antitoxin, until the 
London W.H.O. Conference of 1949. It includes Hartley’s Table of 
International Biological Standards. There is a brief reference to the 
history of Statistical Methods; advances in the design of tests are the 
most noteworthy advance since 1937. 

Section II discusses general ideas. If we are given a standard 
there is no difficulty in defining a unit. The unit is defined as the specific 
biological activity of a given amount of the standard. It cannot be 
defined as the given amount itself because we may want to assay against 
the standard substances which exhibit “the specific activity” but are 
not necessarily in the same chemical form. “Specific activity” although 
somewhat tendentious is an unavoidable phrase. It has as its back- 
ground a working hypothesis which often has to be abandoned as more 
is learnt about the drug. A substance which initially has been re- 
garded as though it were a pure chemical compound has later often 
been found to be a mixture of several. The ideal thing is then to 
enable each of these to be assayed separately, either by biological or 


*This paper is published in full in the Journal of Hygiene Vol. 48, 1950. 
preferably by physical or chemical means. When the constitution of 
each is known and they can be synthesised we are approaching the 
stage when the standard will be unnecessary. To make it unnecessary 
should be the ultimate aim of research. There is no difficulty in de- 
fining potency provided we are prepared to admit that it may vary at 
different levels of dosage or in tests with different species of animals. 
When this happens the definition is deprived of much of its practical 
utility, but the results are an indication that more fundamental re- 
search is required, until the situation is cleared up. 

Section III discusses Statistical Technique. There are subsections 
on the equivalence of various formulae for fiducial limits, on a new 
method for the combination of fiducial limits from individual assays 
to obtain fiducial limits of a pooled result, and on the χ² test in Probit 
Technique when the numbers are small. The true χ² distribution is 
obtained for the particular case of one animal on each of eleven equally 
spaced doses (−2σ(0.4σ)+2σ), the S.D. (σ) of tolerances on the dosage 
scale being supposed known. This is found to be fitted satisfactorily by 
a Pearson Type VI curve. Both very large and very small values of 
χ² occur more frequently than in the 11 d.f. distribution based on 
normal theory. A general method is given for obtaining all the 
semi-invariants of the χ² distribution when the true mortalities are known. 
In most cases where the numbers of animals on each dose are small it 
would appear that a “normal theory” χ² distribution with a modified 
exponent is a satisfactory approximation to the distribution. 
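
The special case described above is small enough to enumerate directly. The following is only an illustrative sketch, not the method of the paper: it assumes that the mean tolerance lies at the central dose (so that the true mortality at each dose is the normal integral of the dose in units of σ), and it takes χ² to be the sum of (r − P)²/(PQ) over the eleven doses, with r equal to 0 or 1. The names used are hypothetical.

```python
from itertools import product
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Eleven doses from -2 sigma to +2 sigma in steps of 0.4 sigma (sigma = 1).
doses = [-2.0 + 0.4 * i for i in range(11)]

# Assumption: the mean tolerance is at dose 0, so mortality = Phi(dose).
mortality = [normal_cdf(x) for x in doses]


def chi_square(outcome):
    """Chi-square for one pattern of deaths (1) and survivals (0)."""
    return sum((r - p) ** 2 / (p * (1.0 - p)) for r, p in zip(outcome, mortality))


def probability(outcome):
    """Probability of that pattern when the true mortalities are known."""
    prob = 1.0
    for r, p in zip(outcome, mortality):
        prob *= p if r == 1 else 1.0 - p
    return prob


# All 2**11 = 2048 possible patterns, each with its chi-square and probability.
exact = [(chi_square(o), probability(o)) for o in product((0, 1), repeat=11)]

mean = sum(c * w for c, w in exact)
variance = sum((c - mean) ** 2 * w for c, w in exact)

print(f"mean = {mean:.4f}, variance = {variance:.2f}")
```

Under these assumptions the mean agrees with the 11 d.f. value, while the variance comes out well above the normal-theory figure of 22, which is in keeping with the remark that extreme values occur more frequently than normal theory predicts.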

Section IV outlines a method for applying probit-technique when 
the test can be arranged so that one member of each of a number of 
litters can be put on each dose. Section V discusses shortly the relative 
value of probits, logits and the angular transformation. Section VI 
compares a number of miscellaneous rapid methods for the quantal 
case, of which Kärber’s method is on the whole the most useful. 
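
The abstract does not write out Kärber’s method. As a reminder only, the sketch below uses the form in which it is usually quoted (often called the Spearman-Kärber estimate): the median effective log dose is the sum, over successive pairs of doses, of the increase in the proportion responding multiplied by the mid-point of the two log doses. The function name and the data are hypothetical.

```python
def spearman_karber(log_doses, proportions):
    """Median effective log dose by the usual Spearman-Karber formula.

    Assumes the doses are in increasing order and that the observed
    proportions run from 0 at the lowest dose to 1 at the highest.
    """
    if len(log_doses) != len(proportions):
        raise ValueError("one proportion is needed for each dose")
    if proportions[0] != 0.0 or proportions[-1] != 1.0:
        raise ValueError("proportions must run from 0 to 1")
    estimate = 0.0
    for i in range(len(log_doses) - 1):
        step = proportions[i + 1] - proportions[i]
        midpoint = (log_doses[i] + log_doses[i + 1]) / 2.0
        estimate += step * midpoint
    return estimate


# Hypothetical quantal data: log10 doses and proportions responding.
log_doses = [0.0, 1.0, 2.0, 3.0]
proportions = [0.0, 0.25, 0.75, 1.0]

log_ed50 = spearman_karber(log_doses, proportions)
print(log_ed50)
```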

Section VII, the concluding section, gives an account of the inter- 
national co-operative test of the new standard for Vitamin D. 

The paper ends by emphasising how great is the debt which is due 
to the pioneers who succeeded in getting standards established, people 
like Dale, Gautier, Gaddum, Hartley and Trevan, thereby enabling 
many of the new discoveries of medicine to be utilised on a comparable 
basis throughout the world to the immense advantage of many thou- 
sands of sufferers; also to the splendid contribution made to this end 
by the Biometricians and Statisticians, too many of whom were at the 
Conference to make it anything but invidious to mention them by name. 
(2) THE ANALYSIS OF A COLLABORATIVE ASSAY 
OF THE THIRD INTERNATIONAL DIGITALIS 
STANDARD PREPARATION* 


W. L. M. Perry, M.D. 
National Institute for Medical Research, London, England 


Abstract 


AS MANY of you are aware, digitalis consists of an extract of the 
leaves of the plant Digitalis purpurea. In this respect it is a 
comparatively simple biological preparation. Chemically, however, it 
is exceedingly complex and its exact nature is not yet elucidated. It is 
known, however, that there are present more than 25 glycosides, some 
of which, like digitoxin, have been prepared in a pure chemical form, 
others of which are of incompletely proven constitution. Moreover, 
two samples of leaf may contain widely different proportions of these 
25 or more glycosides and in some samples a few of them may be alto- 
gether absent. 

It is also known that there are often large variations in the potency 
ratio of two samples of digitalis when the test animal is changed, or 
when the conditions of the assay in any one species of test animal 
are changed. In the first case this can be attributed to a varying 
sensitivity of the animal species to different glycosides which are present 
in the two samples in different proportions and, in the second place, to 
quote one example, the different potency ratios obtained in the past 
between frog assays read at 2 hours and at 24 hours could be attributed 
to different proportions of a slowly-acting glycoside which produced 
its effects only after the end of the 2-hour period. 

In addition to these facts, neither the cat nor the frog is human, 
and neither the guinea-pig nor the pigeon suffers as far as I am aware 
from auricular fibrillation. 

These, then, are the major difficulties and they are indeed formid- 
able. In fact, the assay of such heterogeneous material must always 
be regarded as a makeshift—as a practical necessity in lieu of something 
better—and we must always beware of attaching too great a degree of 
precision or of definition to the final estimate of potency. 

When it became necessary to establish a Third International 
Digitalis Standard some time ago an attempt was made to minimise 


*The full report of the results and analysis of the collaborative assay, a summary of which was 
circulated to members of the Conference, will be published elsewhere. 
these difficulties by arranging for the proposed Third International 
Digitalis Standard to be a blend of a number of separate samples of 
leaf; a similar blend had been used successfully in making the Second 
International Standard. In this way it was hoped to obtain a more 
nearly representative mixture with average relative proportions of all 
the glycosides. In order to detect species differences all the laboratories 
collaborating in the assay of the proposed new standard were requested 
to assay in at least two species of animal. At least one of these two 
assays was to be carried out by one of three recommended methods 
which were circulated to the laboratories, for cats, frogs and guinea- 
pigs respectively. In the recommended methods care was taken to 
ensure firstly randomisation of the animals with certain restrictions 
about weight and sex, and secondly, as complete control as possible 
of known sources of variation, such as speed of injection of dose, method 
of preparing dilutions, and the temporal changes of sensitivity in a 
stock of laboratory animals. In this way it was hoped to be able to 
estimate differences between laboratories using one standard method, 
as well as differences between the alternative methods of assay. 

I would like to consider now the question of assigning a potency 
value to the proposed Third International Digitalis Standard. The 
results of all the analyses, when combined, give a final estimate of the 
weighted mean potency of the proposed Third International Digitalis 
Standard in terms of the Second International Digitalis Standard of 
1.0527, with limits of error (P = 0.05) of 1.036 to 1.069. There are, 
however, several other considerations to be taken into account in 
assigning a definite potency level to the new standard preparation. 
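
The combined estimate can be checked roughly from the potencies and weights given for each method in Table I. The sketch below is not taken from the full report; it assumes that the weights are reciprocal variances of the logarithm (base 10) of each potency, that the combination is made on the logarithmic scale, and that the limits use the normal deviate 1.96. On these assumptions the result agrees closely with the figures quoted above.

```python
from math import log10, sqrt

# (method, potency, weight) as read from Table I.
table_i = [
    ("Frog", 1.0197, 11092),
    ("Cat", 1.0874, 25816),
    ("G.P.", 1.0363, 13927),
    ("Pigeon", 1.0745, 16539),
    ("G.P. (other methods)", 1.0336, 14654),
    ("Dog", 0.8712, 1740),
]

total_weight = sum(weight for _, _, weight in table_i)

# Weighted mean of the log potencies.
mean_log = sum(weight * log10(potency) for _, potency, weight in table_i) / total_weight

# Assumption: each weight is 1 / variance of a log10 potency, so the
# standard error of the weighted mean is 1 / sqrt(total weight).
se_log = 1.0 / sqrt(total_weight)

deviate = 1.96  # assumed normal deviate for P = 0.05

mean_potency = 10 ** mean_log
lower = 10 ** (mean_log - deviate * se_log)
upper = 10 ** (mean_log + deviate * se_log)

print(f"weighted mean potency = {mean_potency:.4f}")
print(f"limits of error = {lower:.3f} to {upper:.3f}")
```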

Firstly, Gold and his co-workers have shown that man is much 
more nearly a cat than a frog—in respect of his reaction to digitalis. 
The response used in man was the degree of improvement in selected 
cases of auricular fibrillation, and if this work is accepted as conclusive 
the potency ratio in the cat—shown in Table I—to which the potency 
ratio in the pigeon approximates, would be expected to reflect more 
exactly the expected potency ratio in man. 

Secondly, it would be convenient practically if the change to the 
new standard could be effected without making any change in the 
regulations. This could legitimately be recommended on statistical 
grounds only if the best estimate of the potency ratio, namely the 
combined weighted mean potency, did not differ significantly from 
unity. This does not hold in the present assay. 

Thirdly, there is no evidence at all to show that the standard 
preparation of digitalis is completely stable. By this I mean that a 
gross deterioration of the Second International Digitalis Standard over 




324. PROCEEDINGS OF INTERNATIONAL BIOMETRIC CONFERENCE 


TABLE I 
RESULTS OF COMBINATION OF ASSAYS DONE SEPARATELY FOR EACH METHOD. 

                                       Limits of Error (P = 0.05) 
Method of          No. of 
Assay              Assays   Potency    Actual        Percentage    Weight   χ²      d.f.   P 

Frog               12       1.0197     0.977-1.064   95.8-104      11092     7.65   11     0.7-0.8 
Cat                16       1.0874     1.057-1.117   97.2-102.8    25816    24.99   15     0.05-0.1 
G.P.                6       1.0363     0.997-1.077   96.3-103      13927     6.00    5     0.3-0.5 
Pigeon              7       1.0745     1.037-1.113   96.5-103.6    16539     5.11    6     0.5-0.7 
G.P. 
(Other Methods)     6       1.0336     0.995-1.072   96.3-103.8    14654    19.12    5     0.001-0.01 
Dog                 1       0.8712     0.775-0.978   88.9-112.3     1740 


a period of more than 10 years would almost certainly have become 
noticeable; a change of the order of 5-10% would equally certainly 
not be evident. I do not wish to labour this point, which is purely an 
academic one, since there is equally no proof that such deterioration 
does occur; moreover the precautions taken in the preparation and 
storage of standard preparations are designed to prevent it; and indeed 
it is a necessary and fundamental assumption in using standard prepa- 
rations to assume absolute stability. Nevertheless it may, perhaps, be 
of some slight significance that I can find no instance, in the small 
number of cases of which records are available, of a replacement 
standard preparation proving to be less potent than its predecessor. 
This will probably be due, in many cases, to improvements in the 
methods of preparation of the new standard; but on the other hand 
it may be due to small reductions in the potency of the old standard 
which had occurred during the several years of its life. Thus, although 
we may be able precisely to define the potency ratio between the new 
standard at the beginning of its life and the old standard at the end of 
its life, we cannot be sure that the same ratio would apply if the old 
standard were also in the full flush of youth. I do not advance this as 
an argument in favour of not using the weighted mean potency as the 
best estimate of the potency of the new standard; I would, however, 
use it to illustrate the extreme difficulty of proving that any such 
estimate is exact; and to argue, therefore, that undue elaboration of 
statistical technique is an unnecessary and unwarranted refinement in 
such cases. 

The final assignment of potency must await the decision of the 
Expert Committee on Biological Standardisation of the World Health 
Organisation, but such points as these will probably have some bearing 
on the question. It is to be hoped, therefore, that today’s discussion 
may throw some interesting light upon them. 
DISCUSSION ON BIOMETRICAL ASPECTS 
OF BIOLOGICAL ASSAY 


C. I. Bliss. In agreement with most investigators, both Doctors Irwin and Perry 
have restricted the term “biological assay” to experiments which would test the null 
hypothesis that the activity assumed for one preparation, the “unknown”, does not 
differ from that observed for a second preparation, the “standard”. One further 
assumes that within the framework of the experiment the unknown is qualitatively 
identical with the standard. The latter hypothesis is a great convenience in design 
and analysis, even though we may know in advance that under other conditions the 
standard and unknown do in fact differ qualitatively. 

Given these requirements, it is clear that an experiment in which the response to a 
single preparation is measured at several dosage levels is not a biological assay, 
despite its key importance in assay design. Its purpose is to determine whether the 
preparation is active and, if active, to define its dosage-response relation. It involves 
only one major assumption, that the dose reaching the site of action is proportional 
to that applied and measured by the experimenter. 

True biological assays depend upon this and additional assumptions which may be 
used to group them under three headings with increasingly stringent restrictions. 

The first of these are comparative assays (Rasch’s term), which are of especial 
interest in research. In addition to the conditions for a dosage-response curve, a 
comparative biological assay requires that the response to both the standard and the 
unknown must be measurable in the same units, that the preparations must be 
compared within self-contained assays of measurable precision and, in some cases, 
that the response to comparable doses of standard and unknown must not differ 
significantly. Comparative assays measure the potency of the unknown relative to 
the standard under specified conditions and determine whether this potency is quali- 
fied by the level of response. As an aid in selecting one from several alternative 
techniques, the inherent precision of each method should be reported. For this 
purpose it is convenient to use Gaddum’s λ = s/b (or 1/b in quantal assays), with 
which one can estimate the number of animals required for a given precision or vice 
versa. λ is needed quite apart from the fiducial limits at P = .95 or .99, used 
commonly but limited to a specific design. 
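
How λ leads to a number of animals may be sketched as follows. This is an approximation offered for illustration, not a formula given in the discussion: it assumes equal numbers of animals on standard and unknown, takes the standard error of the log relative potency as λ·√(2/n) with n animals on each preparation, and neglects the error of the slope. The function names are hypothetical.

```python
from math import ceil, sqrt


def approx_standard_error(lam, n_per_preparation):
    """Approximate S.E. of log relative potency for lambda = s/b.

    Assumes equal numbers on standard and unknown and ignores the
    sampling error of the slope b.
    """
    return lam * sqrt(2.0 / n_per_preparation)


def animals_per_preparation(lam, target_se):
    """Smallest n on each preparation giving S.E. <= target_se."""
    return ceil(2.0 * lam ** 2 / target_se ** 2)


# Hypothetical figures: lambda = 0.2 and a target S.E. of 0.06 log units.
n_needed = animals_per_preparation(0.2, 0.06)
print(n_needed, approx_standard_error(0.2, n_needed))
```

Used the other way round, `approx_standard_error` gives the precision to be expected from a given number of animals, which is the "vice versa" of the text.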

Analytical biological assays (Finney’s term) form the second type, in which the 
active ingredients are presumably identical. These assays are concerned with bio- 
logical standardization and may make use of the techniques of quality control. 
Ideally, they involve two additional conditions that are seldom realized in practice. 
One is that if there are two or more active ingredients, their relative proportions must 
be constant in all preparations. The other requires that the relative potency must be 
entirely independent of the assay method, of the test organism or of the level of 
response. An analytical biological assay answers the question, “What is the potency 
of the unknown in units of the standard?” 

Analytical biological assays should be designed so that their inherent error can be 
measured. When this shows the requisite stability in repeated tests, past experience 
in respect to the slope and the standard deviation can be utilized to minimize the 
error. One should also determine whether the net inter-assay error is negligible, as it 
seems to be in the digitalis assay, or whether it may be larger than the intra-assay 
error, as in the penicillin plate-assay. This cannot be decided a priori and it is an 
important function of collaborative experiments to determine the relative size of 
these two sources of experimental error. This is possible only when each component 
assay provides an estimate both of relative potency and of its internal precision. 
From this viewpoint each collaborative experiment should follow a carefully specified 
design, so that in subsequent statistical analyses observed differences in relative 
potency need not be ascribed to obvious differences in assay techniques. 

The third type consists of the pass-or-fail assays. These have the sole objective 
of determining whether an “unknown” preparation of a given drug passes or fails a 
prescribed standard of potency. Because of its limited objective only the test of 
significance is of concern and such assays may differ materially from those requiring 
an estimation of potency. It is possible, for example, that a single dosage level may 
be preferred theoretically for the unknown. The assay process rather than the indi- 
vidual test is of primary concern. In addition to most of the assumptions underlying 
the other assays, consumers’ and producers’ risks or errors of the first and second kind 
must be established and a method of inspection specified rather exactly. Although 
analytical assays could be used for this purpose, a well-designed pass-or-fail assay 
may be much more efficient. The applicability of these techniques to Pharmaco- 
poeial purposes has yet to be explored in the light of modern developments in inspec- 
tion sampling. 

N.K. Jerne. The first biological substance for which an International Standard 
was adopted was diphtheria antitoxin, a substance contained in the blood-serum of 
animals that have been immunized with diphtheria toxin. In 1922 it was decided 
that 62.8 micrograms of a dried serum-preparation kept at the State Serum Institute 
at Copenhagen was to be the International Standard Unit of diphtheria antitoxin. 
The assay of diphtheria antitoxin is fairly simple, and the possibility of expressing the 
potency of different diphtheria antitoxic sera in International Units has never been 
seriously questioned. 

Nearly all assay methods are based on the following procedure: Diminishing 
amounts of antitoxic serum are measured into a series of test tubes and brought to 
constant total volume with saline. A constant amount of toxin is then added to each 
tube so that in the final series of tubes we have a constant concentration of toxin and 
a diminishing concentration of antitoxin. After all or part of the toxin has been 
neutralized by the antitoxin we can measure the remaining toxin concentration by 
injecting these mixtures into the skin of rabbits (or guinea-pigs). The response to a 
toxin injection is an inflammatory skin-reaction, the diameter of which is a function 
of the concentration of toxin injected. By plotting against these responses the log 
antitoxin dose, we obtain curves which seem to be parallel and reach an asymptote 
corresponding to the response to the toxin concentration in a tube without antitoxin. 
This toxin concentration (corresponding to the fixed amount of toxin added to every 
test tube) can be varied arbitrarily and the assay can thus be carried out at different 
concentration levels, all of which should give us the same potency evaluation of a 
serum when tested against the standard. 

But this is not what we actually find. At very low concentration levels, the log 
dose/reaction curves are no longer parallel for different antitoxins, and the distance 
between the curves is quite different from the distance at higher concentration levels. 
There may be parallel, steep curves at a high concentration level, and unparallel, 
flatter curves at a low concentration level, for the same two preparations. In a single 
routine assay, the slopes of the curves will usually not be determined with a precision 
great enough to show significant departure from parallelism, but there remains the 
difference in distance between the curves at different concentration levels. This proves 
that the active substances in the test serum and the standard serum are not the same. 

The logical result of these observations would be that the potency of an unknown 
antitoxin preparation cannot be expressed in Standard Units. And as these 
considerations undoubtedly apply to other sera for which an International Unit has been 
established, the whole foundation of serological standardization is involved. Yet the 
measurement of the potency of such sera in Units seems often to work quite satis- 
factorily in practice. This is because we are usually dealing with very powerful sera 
for therapeutic use. These can be measured and used in high concentrations, so the 
differences described here are not detected and not important. But in research 
experiments, and in all cases where very small concentrations of antitoxin have to be 
measured, the difficulties become apparent. 

In the case of diphtheria antitoxin we have found that the observations can be 
described by assuming that toxin is neutralized by antitoxin in a reversible equilibrium, 

\[ A + T \rightleftharpoons AT \]

At a high concentration this process goes almost entirely to the right. At low con- 
centrations a large percentage of the two substances remains free, and to neutralize 
the free toxin almost completely a surplus of antitoxin must be added. This surplus 
depends on the value of the equilibrium constant 


\[ \frac{C_A \cdot C_T}{C_{AT}} = K \]


which is different for antitoxins from different sera. 

At high concentrations the influence of K is negligible, and the “potency” of all 
sera corresponds to the relative potency that would have been found if they had the 
same K as the standard serum. But at low concentrations discrepancies will show 
up, depending on the difference in K between the test serum and the standard serum. 
These discrepancies are sometimes very large. The potency evaluation at a high 
concentration may for some sera be 10 times as large as when measured at a low con- 
centration. 
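
The effect of concentration can be illustrated numerically from the mass-action relation alone. The sketch below is not Dr. Jerne's computation: it assumes a simple one-to-one combination of toxin and antitoxin with equilibrium constant K, solves the resulting quadratic for the concentration of the combined product, and reports the fraction of toxin left free. The concentrations and the value of K are hypothetical and in arbitrary units.

```python
from math import sqrt


def free_toxin_fraction(total_toxin, total_antitoxin, k):
    """Fraction of toxin remaining free at equilibrium.

    Solves (A0 - x)(T0 - x) = K x for x, the concentration of the
    toxin-antitoxin product, taking the physically admissible root.
    """
    s = total_toxin + total_antitoxin + k
    combined = (s - sqrt(s * s - 4.0 * total_toxin * total_antitoxin)) / 2.0
    return (total_toxin - combined) / total_toxin


# Hypothetical figures: equal amounts of toxin and antitoxin, K = 1.
high = free_toxin_fraction(1000.0, 1000.0, 1.0)  # high concentration level
low = free_toxin_fraction(1.0, 1.0, 1.0)         # low concentration level

print(f"free toxin at high concentration: {high:.3f}")
print(f"free toxin at low concentration:  {low:.3f}")
```

With these figures only a few per cent of the toxin remains free at the high level, while more than half remains free at the low level, so that a surplus of antitoxin depending on K is needed there.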

The antitoxin contents of an unknown serum can thus be described by two quan- 
tities: an estimate of the potency it would have had if its K were the same as the K of 
the standard serum, and an estimate of the actual K from which the neutralizing 
power of the serum at all concentration levels can be computed. These two quantities 
can be estimated from two experiments at different concentration levels, and in 
research experiments this procedure should certainly be followed. 

I have assumed that the determination of potency and of K is independent of the 
toxin used in these assays. If toxins of different quality should yield different esti- 
mates of K for the same serum, the matter becomes much more complicated. 

In routine assays, where the purpose is to determine the potency of highly 
concentrated antitoxic sera, there are several reasons for disregarding these 
complications. It seems quite sufficient to evaluate the potency of such sera in the same simple 
way as has hitherto been the practice, and to express the estimate of potency in Units. 

E. C. Fieller. As an alternative to Dr. Irwin’s reconciliation of the various 
formulae advanced for computing the fiducial limits in biological assay, it would seem better 
to abandon the formulae, and compute the limits arithmetically, from the familiar 
quadratic equation. This can be regarded as assigning limits to the root θ of a simple 
equation, the coefficients of which are linear functions of normally distributed obser- 
vations; a similar method serves to assign limits to the roots of any equation with 
coefficients of that nature. 
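
The arithmetic may be sketched as follows, in the form in which this argument is usually stated; the notation is an assumption, not taken from the discussion. Let a and b be the two normally distributed linear functions, θ the root of a − θb = 0, and let var_a, var_b and cov_ab be their estimated variances and covariance. The limits are then the roots of (b² − t²·var_b)θ² − 2(ab − t²·cov_ab)θ + (a² − t²·var_a) = 0, where t is the appropriate tabular deviate.

```python
from math import sqrt


def fieller_limits(a, b, var_a, var_b, cov_ab, t):
    """Fiducial limits for theta = a / b from the quadratic equation."""
    qa = b * b - t * t * var_b
    qb = -2.0 * (a * b - t * t * cov_ab)
    qc = a * a - t * t * var_a
    if qa <= 0.0:
        # b does not differ significantly from zero: no finite interval.
        raise ValueError("denominator not significantly different from zero")
    discriminant = qb * qb - 4.0 * qa * qc
    if discriminant < 0.0:
        raise ValueError("quadratic has no real roots")
    root = sqrt(discriminant)
    return (-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)


# Hypothetical figures: a = 2, b = 4, variances 0.01, no covariance, t = 2.
lower, upper = fieller_limits(2.0, 4.0, 0.01, 0.01, 0.0, 2.0)
print(f"ratio = {2.0 / 4.0}, limits = {lower:.4f} to {upper:.4f}")
```

The limits are not symmetrical about the ratio a/b, which is one reason for computing them directly rather than from an approximate standard error.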
In connection with Dr. Irwin’s method of combining the information supplied by 
several independent assays, alternative methods have been outlined when the residual 
variances remain constant from assay to assay. One for use when the slope remains 
constant extended the method of a previous paper. Of the other two for use when the 
slope varies, one is based on work by D. V. Lindley, the other on work as yet unpub- 
lished by K. D. Tocher. 

For judging the inherent reliability of an assay method, there may be practical 
reasons for preferring the ratio b²/s² to that advocated by Bliss (λ = s/b). For 
deciding in practice which of two rival assay methods to adopt, it is essential to con- 
sider their comparative costs per response as well as their comparative inherent 
reliabilities. 

J. Tripod. I have here the same feelings as a general practitioner at a meeting of 
specialists, who is picking up what he needs and selecting new methods according to 
their precision, simplicity and usefulness rather than for their theoretical basis. As a 
pharmacologist, I look upon biometrics (1) as an aid to the better planning of experi- 
ments, (2) to find the probability of differences and (3) to obtain a sound standardiza- 
tion of specific pharmacological properties. But all of these are tools for enabling me 
to form a final judgment and never a goal in themselves. Like other tools, the value 
of a biometric method depends on its precision, simplicity, rapidity and economy. 
Some methods are relatively easy to apply, but in other cases, we encounter great 
difficulties either in the computations or in collecting a sufficient number of observa- 
tions. 

Given the same experimental data, we have many ways of estimating a relative 
potency. In “all or none” responses we found, for example, the following values from 
the same observations of the relative toxicity of 2 narcotics in mice perorally, in 
percentages 


Graphically                          Algebraically 
Trevan            32.8               Behrens            33.8 
Gaddum NED        32.9               Van der Waerden    34.1 ± 5.7* 
Miller-Tainter    34.0 ± 5.7*        Bliss              33.7 ± 5.6* 
Prigge            33.3 ± 5.9* 

*P = 0.05 


With the method of Miller-Tainter I found the potency in some minutes, while I 
needed some hours and even some days without a calculator for the other computa- 
tions. This example is given to show that we are thankful to the statisticians who 
give us tools which we can use with more rapidity and more economy, but it is also 
fair to say that such simplifications are deduced from the more complicated methods. 

A second point which is very important for me as a pharmacologist is the unifying 
and internationalization of statistical symbols. This need may not be apparent in 
the USA, but I am afraid that the slower adaptation of European biology to bio- 
metrics is largely due to the confusion created by various symbols. I am very pleased 
to have heard that this question is under consideration. The teaching of biometrics 
would then be easier and more people without a good mathematical background would 
be attracted by the charm of statistics. 

This leads me to a third point. It is very well to apply statistical methods to 
physiology and pharmacology. But I believe that these sciences are also a means to 
find new treatments for curing sick people. Hence the statistical treatment of clinical 
investigations is at least as important as that of animal experiments. It is unfortunate 
that many differences between new treatments cannot be analyzed because of 
heterogeneity of the cases, wrong planning of the treatment, insufficient number of 
observations or even lack of controls. On the other hand, good statistical treatment of a 
specific effect on animals might give interesting hints for the clinic. Such uses in the 
clinic can only be promoted by teaching biometrics in medical schools or in post- 
graduate courses. 

Sir Percival Hartley. The question of the stability of biological standards is 
fundamental to the whole process of biological standardisation and the practice of 
expressing the potency of biological substances in units. It has engaged the attention 
of those concerned with the preparation of standards in the inter-war years, and con- 
tinues to do so. It is difficult to devise experiments which will detect small changes in 
potency, especially in those assays dependent on the determination of the LD50 dose. 
Other steps had been taken to ensure, as far as possible, the stability of the standards. 
Since deterioration is almost certainly associated with chemical change, it was con- 
sidered that if this could be reduced to a minimum stability should ensue. Accord- 
ingly, the standards are absolutely dry preparations, sealed in ampoules which are 
filled with pure dry nitrogen gas, and kept constantly in the dark and at low tempera- 
tures. Other experiments, in which the standards had been sent to India and Aus- 
tralia and back several times or exposed to other adverse conditions had given con- 
fidence that the preparation of the standards and their subsequent care, had been 
effective in preserving their potency. 

D. J. Finney. Dr. Irwin has described the design used for the collaborative 
vitamin D assays. In this design, the four solutions were compared by intra-litter 
contrasts, but the regression coefficient was based upon inter-litter contrast. The 
problem arises as to whether the true inter and intra litter regression coefficients of 
response on dose are necessarily identical. In general they will be, but special 
circumstances might make them unequal; this type of design, often used for other assay 
problems, would then give a misleading estimation of potency. In any event, the 
design has the disadvantage that the regression coefficient is likely to be estimated 
with relatively much less precision than the mean differences in response. 

By modifying the design used in the vitamin D assay, it is possible to alter the 
scheme of confounding between litters and to obtain an intra-litter estimate of 
regression without upsetting the estimation of the response differences. For example, 
with a standard preparation, S, and three test preparations A, B, C, a design for 
litters of four and three doses of each preparation would be made up of sets of nine 
litters as follows, the suffixes referring to the dose levels: 


Si Ai B; Cs IV: S; A; B, Cy VII: S: A; B, C2 
II: Si As; B, C3 ve Ss Ay B,; Ci VIII: Ao B, C, 
Si A; B; Ci Viz Ss Ai B, Cs EX: B, C2 


Three repetitions of this scheme would need 27 litters, and the experiment would then 
be comparable with the design for 30 litters actually used. 

Martin Leopold. It may be mentioned that in the biological assay of digitalis 
the calcium level can play a very important part. Indeed, in 1939, Professor 
T. Labarse and Dr. J. van Heveswinghels of Brussels showed that, following the 
intravenous administration of calcium, the lethal dose of digitalis for the cat 
is considerably reduced. Hence, in biological assays, the blood calcium level 
could be a statistical factor to be controlled. 

Other participants in the discussion included F. Bernstein and G. Rasch. 

(1) OUTLINE OF A MATHEMATICAL THEORY OF PECK 
RIGHT 


ANATOL RAPOPORT 


University of Chicago 


INTRODUCTION 


C. Murchison (1935) has identified a number of “quanta of social 
phenomena” in the Gallus domesticus (the chicken). One of these is 
the so-called Social Reflex No. 1, observed in the movement toward 
another member of the species; another is Social Reflex No. 2, fighting 
a member of the species. He then established methods for measuring 
these quanta: Reflex No. 1 by the velocity of approach and Reflex 
No. 2 by willingness to fight and by success in battle. As co-variables, 
Murchison considers mass, momentum per second and kinetic energy 
per second in Reflex No. 1 and their relation to success in battle. 
Another co-variable is “social rank,” that is, the peck order position of 
the bird in its own flock. 

N. Collias (1942) points out that among hens “social rank” is 
established by the result of the first encounter. He then proceeds to 
analyze the correlations between success in combat with a member of 
a strange flock and other measurable characteristics, such as size of 
comb, degree of moulting, and social rank in the home flock. 

Thus the task of identifying, measuring, and correlating “social 
quanta” is under way in the study of events which may well be con- 
sidered simple social phenomena. In these studies, the unit is taken 
to be the individual, and the observables are his individual character- 
istics and his relations to other individuals within and outside his 
society (flock). 

We wish to attack the problem from a somewhat different point 
of view and by a different method. Our unit will be the society (a 
small flock of birds), and the observables will be the elements of structure 
of that society. Our method will be deductive. We shall make certain 
postulates concerning the determinants of social structure and by 
probabilistic analysis draw conclusions about the social structures 
likely to arise. 

In the work of Murchison, Collias, and others, where correlation is 
sought between observables, it is desirable to reduce unknown and 
variable factors to a minimum, since these tend to reduce the 
significance of the correlations. But in our (probabilistic) approach, we 
shall on the contrary be more interested in random events. Ideal 
situations of this sort (where complete randomness reigns) yield no 
less significant regularities and laws than “deterministic” situations. 


GENERAL CONSIDERATIONS 


Peck order or peck right is a binary asymmetric relation between 
each pair in a finite aggregate of individuals (a society). The relation 
will be designated by > (or by <), so that A > B is read “A pecks 
B,” and, of course, A < B is read “A is pecked by B.” The relation 
is asymmetric: if A > B holds, then B > A does not hold, at least 
not simultaneously. Two facts about this relation (as observed in bird 
and other societies) are of interest to the mathematical biologist. 

1. The relation is not necessarily transitive: A > B and B > C together 
do not always imply A > C. 

2. The set of relations in a given society may change with time, 
tending to transform the structure of the society into a simple chain 
(Murchison, 1935; Allee, 1931), thus, 


$$A > B > C > \cdots \tag{1}$$


The first observation points to indeterminate factors in the dy- 
namics of such a society, while the second points to determinate factors. 

One could, of course, postulate a deterministic dynamics and still 
get intransitive relations. For example, the peck right between two 
individuals may be determined by the relative magnitudes of the values 
of a function $f(x_A, y_B)$, whose arguments are variables associated with 
both individuals, so that $A \gtrless B$ according to whether 
$f(x_A, y_B) \gtrless f(x_B, y_A)$. Then we may well have 

$$f(x_A, y_B) > f(x_B, y_A); \quad f(x_B, y_C) > f(x_C, y_B); \quad f(x_C, y_A) > f(x_A, y_C). \tag{2}$$

One could, therefore, begin the logical development of the peck right 
problem by examining some such function $f(x, y)$. The structure of 
the society would then be determined by the form of f and by the 
distribution of the characteristics among its members. To account for 
changes in structure, one could make f dependent either explicitly or 
implicitly on time. This approach does not seem attractive since the 
number of functions one may choose for f is prodigious, and one has 
no a priori reason to prefer one to another. 

The probabilistic hypothesis, on the other hand, supposes that the 
structure of a society is determined to a certain extent by the outcomes 
of chance events. This means, in the last analysis, that we are taking 
into account our ignorance of the determining factors in the dynamics 
of the society, but we are formulating this ignorance mathematically 
as is done in any probabilistic approach to a problem. It is also possible 
to combine deterministic with probabilistic factors, where, for example, 
one assigns to events probabilities which are functions of some known 
quantities. 

There is another reason besides mathematical convenience for in- 
troducing chance events into the dynamics of animal sociology. If, as 
is indicated by the observation above, the later stages of a society are 
more “organized” than the early stages, one could postulate the initial 
working of chance and an inherent bias in the situation which causes 
the structure to “gravitate” to a certain form. Thus, if one spills peas 
on an uneven surface, the initial distribution of the peas will be nearly 
random, but eventually they will assume positions along the lines of 
minimum potential energy. 


SOME POSSIBLE ASSUMPTIONS 


A natural choice of a chance event is the result of an encounter 
between two individuals. We shall assume that in any such encounter 
one will enjoy “victory” and the other suffer “defeat.” It will be 
further assumed that the probabilities of the outcomes of such encounters 
can be computed on the basis of the knowledge of the history of the 
individuals and their inherent characteristics. The “structure” of a 
society, that is the distribution of social ranks, will be supposed to be 
determined by the outcomes of such encounters. 

One can make several different hypotheses about factors influencing 
the outcomes and about factors resulting from them. The following 
are examples of such hypotheses, not necessarily consistent with each 
other. 

1. The result of the first encounter between any two individuals is 
equiprobable. 

2. The probabilities of the results of the first encounter are functions 
of the respective characteristics (independent of time) of the indi- 
viduals concerned. 

3. The probability of each encounter depends on the result of the 
immediately preceding encounter between the same individuals. 

4. The result of each encounter depends on the total history of 
encounters experienced by the individuals concerned. 

5. Relative social rank between two individuals is completely 
determined by the result of their first encounter. 

6. The result of each encounter determines the relative social rank 
between two individuals, but other encounters may ensue in which the 
rank may be reversed. The probability of the reversal depends on the 
difference in the social rank between the individuals. 

7. The probability of an encounter taking place at all is a function 
of the difference in social rank. 

Etc. 

One could begin with any self-consistent combination of these 
assumptions and derive its implications. These implications would be 
conclusions about the probability distributions of various “types” of 
societies, and, if the frequency of encounters were also known or 
assumed, one could (in principle) derive expressions for the changes of 
these probability distributions with time. One should thus obtain a 
“wave” of probability distribution. The abscissa of this wave would 
be a particular type of structure of a society (to be defined below); the 
ordinate would be the probability frequency of this structure; and the 
whole wave would be moving into the third (time) dimension, changing 
its shape as it moved, so that after a long time it would exhibit a sharp 
crest at the type of structure toward which societies tend to gravitate. 
Thus a surface would be generated by this probability wave which 
could be considered as a complete representation of the “statistical 
dynamics” of the society. 


TYPES OF STRUCTURE 


It remains to define the “type of structure” or simply the “structure” 
of a society. Evidently a definition is desired which would 
enable us to recognize structure by observation. In a society of N 
individuals, each will have $N - 1$ peck right relations, of which r will 
be dominant (the right to peck) and $N - 1 - r$ submissive (the 
necessity to be pecked). There will then be a distribution of numbers 
$(r_1, r_2, \ldots, r_N)$ among the N individuals such that 

$$\sum_{i=1}^{N} r_i = \frac{N(N-1)}{2}. \tag{3}$$

This set of numbers $(r_1, r_2, \ldots, r_N)$ together with all its permutations 
will be defined as a structure of the society. A structure can also be 
represented by a diagram, where each individual is placed on a level 
determined by the number $r_i$ of dominant peck right relations he 
enjoys. Thus a society of four individuals can have exactly four 
structures, diagrammatically represented in Figure 1. Arrows indicate 
peck right; the numbers indicate the dominant peck right relations of 
each individual. 

FIGURE 1. Panels (a), (b), (c), (d). 


The problem of determining the dynamics of such a society, given a 
set of assumptions about the occurrence and results of encounters, 
reduces to the calculation of the probability of occurrence 
of each structure at each time in the history of the society. 

We shall first make the following simple assumptions. 

1. The results of the first encounter between any two individuals 
are equiprobable. 

2. The result of the first encounter determines permanently the peck 
right between the individuals concerned (i.e., the winner will always 
peck the loser). 

Thus the shape of the probability distribution wave is permanently 
established (independent of time) as soon as all the $\tfrac{1}{2}N(N - 1)$ 
encounters have taken place. An alternative interpretation is that we 
are limiting our observation to the period when the structure depends 
on the results of first encounters only. 


THREE INDIVIDUALS 


Since only one type of structure is possible for a society of two 
individuals, namely, (1, 0), i.e., the “dominant” individual pecks one 
individual, and the submissive one pecks none, we shall begin with the 
case of three individuals. 

There are $2^3 = 8$ possible sets of outcomes of the three encounters. 
Each set of outcomes leads to one of two possible structures, (2, 1, 0) and 
(1, 1, 1). That is to say, in one type of society, (2, 1, 0), there is one 
individual who pecks two others, one individual who pecks one other, 
and one who pecks none. In the other type (1, 1, 1) each individual 
pecks one other (and, of course, is pecked by one). The respective 
probabilities of these structures are 3/4 and 1/4, since six of the eight 
outcomes map on the first (simple chain), and two on the second (simple 
cycle). This solves the problem of the three individuals. 
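
The count for three individuals can be checked by brute force. The following Python sketch is not part of the original paper; the function name `structure_probabilities` is an illustrative assumption. It enumerates every set of first-encounter outcomes, treats each as equally likely (assumptions 1 and 2 above), and tallies the resulting structures.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product


def structure_probabilities(n):
    """Probability of each structure when all first encounters are fair coin flips."""
    pairs = list(combinations(range(n), 2))
    counts = Counter()
    # Each outcome assigns a winner to every pair: 1 means the first member wins.
    for outcome in product((0, 1), repeat=len(pairs)):
        wins = [0] * n
        for (i, j), first_wins in zip(pairs, outcome):
            wins[i if first_wins else j] += 1
        # A structure is the set of dominance counts regardless of who holds them.
        counts[tuple(sorted(wins, reverse=True))] += 1
    total = 2 ** len(pairs)
    return {structure: Fraction(c, total) for structure, c in counts.items()}


for structure, prob in sorted(structure_probabilities(3).items(), reverse=True):
    print(structure, prob)
```

The same function can be called with `n = 4` to list the four structures of a four-individual society; the running time doubles with every added pair, so this direct enumeration is only practical for small societies.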


SOME ASPECTS OF THE N-INDIVIDUALS PROBLEM 


One would like to generalize the solution to N individuals. How- 
ever, difficulties are encountered even under the extremely simple 
assumptions made above. 

The probability of an event is defined as the ratio of the number 
of ways the event can happen as a result of certain circumstances to 
the number of possible occurrences, assumed equally likely. Now in a society of 
N individuals there will eventually take place $\tfrac{1}{2}N(N - 1)$ first 
encounters. Since there are two possible results to each encounter, the 
number of all possible occurrences will be $2^{N(N-1)/2}$. To compute the 
probability of a certain structure, one must calculate how many of 
those “sets of results” will map on a particular structure, that is, a 
particular distribution of peck rights $(r_1, r_2, \ldots, r_N)$ or any 
permutation thereof. This would be a simple matter if there were a one-to-one 
correspondence between each set of results and each ordered set 
$(r_1, r_2, \ldots, r_N)$. Then the probability of the structure $(r_1, r_2, \ldots, r_N)$ 
would be simply the number of distinguishable permutations of the set 
divided by $2^{N(N-1)/2}$. That the mapping is not in general one-to-one 
is shown by the following counterexample. 

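Each set of results can be represented by a skew-symmetric matrix, 
for example, in a society of four individuals by 

[Matrix (4): a 4 × 4 skew-symmetric matrix with rows and columns labelled A, B, C, D; the entries are not legible in the source.] 

where a 1 in the X-row, Y-column indicates X > Y and a −1, X < Y. 

Note that the matrix (4) maps on the structure (1, 2, 1, 2). But so 
does the matrix 

[Matrix (5): a second 4 × 4 skew-symmetric matrix with rows and columns labelled A, B, C, D; the entries are not legible in the source.] 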
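
The failure of the one-to-one correspondence can also be illustrated by enumeration. The Python sketch below is an illustration only and does not claim to reproduce the author's matrices (4) and (5); the function name is hypothetical. It lists every skew-symmetric sign matrix of order four whose rows, taken in the order A, B, C, D, give the ordered structure (1, 2, 1, 2).

```python
from itertools import combinations, product


def matrices_with_ordered_structure(target):
    """All skew-symmetric +1/-1 matrices whose i-th row contains target[i] entries equal to +1."""
    n = len(target)
    pairs = list(combinations(range(n), 2))
    found = []
    for signs in product((1, -1), repeat=len(pairs)):
        a = [[0] * n for _ in range(n)]
        for (i, j), value in zip(pairs, signs):
            a[i][j] = value    # +1 in row i, column j means i pecks j
            a[j][i] = -value   # skew-symmetry: j is pecked by i
        wins = tuple(row.count(1) for row in a)
        if wins == tuple(target):
            found.append(a)
    return found


matches = matrices_with_ordered_structure((1, 2, 1, 2))
print(len(matches), "matrices map on the ordered structure (1, 2, 1, 2)")
for matrix in matches:
    for label, row in zip("ABCD", matrix):
        print(label, row)
    print()
```

Because more than one matrix is found, counting the distinguishable permutations of the ordered set alone would undercount the sets of results that map on this structure.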

Without solving the problem of determining the probability 
distribution of structures for all N, it is nevertheless possible to re-word 
it in such a way as to state the results in terms of properties of such 
skew-symmetric matrices. We shall here state the result proved 
elsewhere (Rapoport, 1949a). 

Consider the class $\mathfrak{M}$ of all skew-symmetric matrices $(a_{ij})$ of order 
N, all of whose non-diagonal elements are units, that is, $a_{ij} = 1$ or 
$-1$; $a_{ij} = -a_{ji}$. We then have the following 

Theorem 1. Let $(a_{ij})$ be a matrix of class $\mathfrak{M}$ and let 

$$S_k = \left(\sum_{j=1}^{N} a_{1j},\ \sum_{j=1}^{N} a_{2j},\ \ldots,\ \sum_{j=1}^{N} a_{Nj}\right)$$

be the ordered set of its row sums. Let $p_k$ be the number of 
distinguishable permutations of the set $S_k$, and $m_k$ the number of matrices 
in $\mathfrak{M}$ giving rise to the set of row sums $S_k$. Then the probability of 
the structure $(r_1, r_2, \ldots, r_N)$, where 

$$r_i = \frac{1}{2}\left(\sum_{j=1}^{N} a_{ij} + N - 1\right), \tag{6}$$

is given by 

$$P(r_1, r_2, \ldots, r_N) = \frac{p_k\, m_k}{2^{N(N-1)/2}}. \tag{7}$$

All the structures $(r_1, r_2, \ldots, r_N)$ are such that the $r_i$ may be put in 
the form (6). 

A general method of finding m, for any S, and any N is at this 
time unknown to the author. However, for N not too large, say, N < 6, 
the probabilities of all structures can be easily computed by a systema- 
tized method (Rapoport 1949a). 

A sample table for N = 5 is given below. 


Structure Probability 
(4, 3, 2, 1, 0) 15/128 
(3, 3, 3, 1, 0) 5/128 
(3, 3, 2, 2, 0) 15/128 
(3, 3, 2, 1, 1) 30/128 
(4, 2, 2, 2, 0) 5/128 
(4, 2, 2, 1, 1) 15/128 
(4, 3, 1, 1, 1) 5/128 
(3, 2, 2, 2, 1) 35/128 
(2, 2, 2, 2, 2) 3/128 
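
The table can be reproduced for small N by direct enumeration. The following Python sketch is an illustration and not the systematized method of Rapoport (1949a); all function names are assumptions. It builds the row sums of every sign matrix, converts them to dominance counts with relation (6), counts the matrices $m_k$ that give one ordered set, and multiplies by the number $p_k$ of distinguishable permutations, on the assumption that the probability of a structure is $p_k m_k / 2^{N(N-1)/2}$.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product
from math import factorial


def ordered_structure_counts(n):
    """For every ordered tuple (r_1, ..., r_N), count the sign matrices producing it."""
    pairs = list(combinations(range(n), 2))
    counts = Counter()
    for signs in product((1, -1), repeat=len(pairs)):
        row_sums = [0] * n
        for (i, j), a_ij in zip(pairs, signs):
            row_sums[i] += a_ij   # entry a_ij
            row_sums[j] -= a_ij   # entry a_ji = -a_ij
        # Relation (6): r_i = (row sum + N - 1) / 2, always an integer.
        counts[tuple((a + n - 1) // 2 for a in row_sums)] += 1
    return counts


def distinguishable_permutations(values):
    """Number of distinct orderings of a multiset (p_k in Theorem 1)."""
    result = factorial(len(values))
    for multiplicity in Counter(values).values():
        result //= factorial(multiplicity)
    return result


def structure_table(n):
    """Probability p_k * m_k / 2^(N(N-1)/2) for each structure of an N-individual society."""
    counts = ordered_structure_counts(n)
    total = 2 ** (n * (n - 1) // 2)
    table = {}
    for ordered, m_k in counts.items():
        key = tuple(sorted(ordered, reverse=True))
        if key not in table:
            # Relabelling individuals shows m_k is the same for every ordering of the set.
            table[key] = Fraction(distinguishable_permutations(ordered) * m_k, total)
    return table


for structure, prob in sorted(structure_table(5).items(), reverse=True):
    print(structure, prob)
```

Fractions are printed in lowest terms, so an entry such as 30/128 appears as 15/64.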

The foregoing results implicitly predict the distribution of 
structures in a large number of societies of a given number of individuals, 
provided the hypotheses used in their derivation are valid. On the 
other hand, there are strong indications that in any actual flock 
significant bias factors are at work. We should not, therefore, expect the 
distribution in actual flocks to follow the computed distributions given 
here. They may be expected to hold, figuratively speaking, only in 
vacuo. However, a liaison between the above theoretical considerations 
and experimental procedure can be accomplished in a variety of ways. 

1. The above results can be used as a theoretical stepping stone for 
complicating the theory by the introduction of biases known to be 
important. 

2. A crude prediction can be made even on the basis of the results 
obtained, namely, that in a situation where peck right is determined 
by the results of the first encounter, the probability distribution will 
depart from that predicted here so as to weigh more heavily those 
structures which are closer to complete hierarchy (simple chain). For 
example, on the basis of random results of encounters, the probability 
of a complete hierarchy in a society of three individuals is 3/4. With the 
introduction of some bias, we would expect it to exceed that value. 

3. Whereas the statistical work of correlating physical 
characteristics and social rank with success in combat is most significant 
when the opponents are not evenly matched, the results of the present 
theory can be best realized when they are matched as evenly as possible. 
Thus experimental work can be extended to another range. 


“SOCIAL MUTATIONS” 


A society of N individuals may have n(N) possible structures. Let 
us now suppose that from time to time encounters take place between 
pairs of individuals. The result of each encounter may be either the 
preservation of the old peck right relation between the two individuals 
or its reversal. If the peck right relation is preserved, then certainly 
the structure of the society is also preserved. On the other hand, if 
the peck right relation is reversed, the structure may be changed, or it 
may not. For example, the structure $S_1$: (2, 1, 0) of the three-individual 
society is diagrammatically represented as in Fig. 2(a). 

A reversal of peck right between individuals 2 and 0 changes the 
structure to $S_2$: (1, 1, 1), as in Fig. 2(b). 

But the reversal of peck right between 2 and 1 results again in the 
structure (2, 1, 0) with the individuals simply relabelled, as in Fig. 2(c). 

FIGURE 2. Panels (a), (b), (c). 

We shall refer to changes in structure as “mutations” and will 
denote the mutation $S_i \to S_j$ by $S_{ij}$. The probability of the 
occurrence of $S_{ij}$ will be denoted by $a_{ij}$. 

In general, not every mutation $S_{ij}$ can be accomplished by a single 
reversal of peck right. When the mutation $S_i \to S_j$ cannot be 
accomplished by a single reversal, the probability of $S_{ij}$ is 0 ($a_{ij} = 0$). 
Otherwise $a_{ij}$ ($i \neq j$) will depend on the probability of an encounter 
between a pair of individuals which may result in $S_{ij}$ and on the 
probability of the victory going to the previously “submissive” individual. 
However, in estimating the ultimate fate of the society (the limiting 
probability distribution of its possible structures), one may use the 
parameters $a_{ij}$ directly and compute the limiting distribution in terms 
of these. The computation of the $a_{ij}$ in terms of the probabilities of 
encounters and reversals is a separate problem. We shall first state 
the problem of determining the limiting distribution in terms of the 
$a_{ij}$. 

Let $S_i(t)$ be the probability of occurrence of the structure $S_i$ at 
the time t. Take as a unit of time the average interval between 
encounters. Then 

$$\begin{aligned}
S_1(t) &= a_{11} S_1(t-1) + a_{21} S_2(t-1) + \cdots + a_{n1} S_n(t-1) \\
S_2(t) &= a_{12} S_1(t-1) + a_{22} S_2(t-1) + \cdots + a_{n2} S_n(t-1) \\
&\;\;\vdots \\
S_n(t) &= a_{1n} S_1(t-1) + a_{2n} S_2(t-1) + \cdots + a_{nn} S_n(t-1)
\end{aligned} \tag{8}$$


Note that the $a_{ii}$ are the “identity mutations,” that is, $a_{ii}$ is the 
probability of the preservation of the structure $S_i$. We denote the 
matrix of the probabilities by $(a_{ij})$. The attention of the reader is 
called to the fact that the subscript notation is the reverse of the 
conventional one, $a_{ij}$ being the element of the i-th column and the 
j-th row. We have chosen this departure from convention in order to 
keep the suggestion that $a_{ij}$ stands for the probability of $S_i$ mutating 
to $S_j$. 

If $S(t)$ is the vector $\{S_1(t), S_2(t), \ldots, S_n(t)\}$, equation (8) may be 
written in vector-matrix notation thus, 

$$S(t) = (a_{ij})\, S(t-1). \tag{9}$$

Hence by iteration, 

$$S(t) = (a_{ij})^t\, S(0), \tag{10}$$

and 

$$S(\infty) = \lim_{t \to \infty} (a_{ij})^t\, S(0). \tag{11}$$

Theorem 2. There exists a matrix $(\alpha_{ij}) = \lim_{t \to \infty} (a_{ij})^t$ such 
that the columns of $(\alpha_{ij})$ are all identical. The vector represented by 
any of these columns is the limiting distribution vector $S(\infty)$, and is 
independent of the initial distribution $S(0)$. 

This result is known to statisticians familiar with the Markoff 
process. A proof based on mathematical induction and multiplication 
of matrices is given in another paper of the author (Rapoport 
1949b). 
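
Theorem 2 can be illustrated numerically. The Python sketch below is not from the paper: the three-structure mutation probabilities in `a` are hypothetical values chosen only for illustration, and the helper names are assumptions. It stores `a[i][j]` as the probability of structure i mutating to structure j, transposes it to match the paper's column convention, and raises the matrix to a high power.

```python
def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def mat_pow(m, t):
    """t-th power of a square matrix by repeated multiplication."""
    n = len(m)
    result = [[float(r == c) for c in range(n)] for r in range(n)]
    for _ in range(t):
        result = mat_mul(m, result)
    return result


# Hypothetical mutation probabilities: a[i][j] = P(S_i -> S_j); each row sums to 1.
a = [[0.8, 0.2, 0.0],
     [0.3, 0.5, 0.2],
     [0.1, 0.4, 0.5]]

# Paper's convention: a_ij sits in column i, row j, so the matrix acts on column vectors.
m = [[a[i][j] for i in range(3)] for j in range(3)]

limit = mat_pow(m, 200)
limiting_vector = [row[0] for row in limit]  # any column of the limit matrix

for row in limit:
    print(["%.6f" % value for value in row])
```

Each printed row repeats a single value, which is the same as saying that the columns of the limit matrix coincide; the common column is the limiting distribution whatever the initial vector.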


CALCULATION OF THE LIMITING DISTRIBUTION 


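The proof of Theorem 2 implies the existence of a vector S 
invariant under the transformation $(a_{ij})$, that is, 

$$(a_{ij})\, S = S; \qquad \sum_{i=1}^{n} S_i = 1. \tag{12}$$

The vector S is the limiting distribution vector and can be found by 
solving the system of $n - 1$ linear equations, 

$$\begin{aligned}
S_1 &= a_{11} S_1 + a_{21} S_2 + \cdots + a_{n-1,1} S_{n-1} + a_{n1}\Bigl(1 - \sum_{i=1}^{n-1} S_i\Bigr) \\
S_2 &= a_{12} S_1 + a_{22} S_2 + \cdots + a_{n-1,2} S_{n-1} + a_{n2}\Bigl(1 - \sum_{i=1}^{n-1} S_i\Bigr) \\
&\;\;\vdots \\
S_{n-1} &= a_{1,n-1} S_1 + a_{2,n-1} S_2 + \cdots + a_{n-1,n-1} S_{n-1} + a_{n,n-1}\Bigl(1 - \sum_{i=1}^{n-1} S_i\Bigr)
\end{aligned} \tag{13}$$

Denote by $(b_{ij})$ the $(n - 1)$-rowed matrix obtained by deleting the 
n-th row and n-th column of the matrix $(I - (a_{ij} - a_{nj}))$, where I is 
the identity matrix, and denote by $(b_{ij})^{(k)}$ the matrix obtained by 
replacing the k-th column of $(b_{ij})$ by the vector $(a_{n1}, a_{n2}, \ldots, a_{n,n-1})$. 
Then by Cramer’s Rule the components of the limiting distribution 
vector S are given by 

$$S_k = \frac{\bigl|(b_{ij})^{(k)}\bigr|}{\bigl|(b_{ij})\bigr|}, \qquad k = 1, 2, \ldots, n - 1. \tag{14}$$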

EXAMPLE: SOCIETY OF THREE INDIVIDUALS 


In case N = 3, n = 2, and the system (13) reduces to a single 
equation, 

$$S_1 = a_{11} S_1 + a_{21}(1 - S_1), \tag{15}$$

whose solution is 

$$S_1 = \frac{a_{21}}{1 - a_{11} + a_{21}}. \tag{16}$$

Then obviously 

$$S_2 = 1 - S_1 = \frac{1 - a_{11}}{1 - a_{11} + a_{21}}. \tag{17}$$

We can now make assumptions concerning probabilities of 
encounters and victories, and on the basis of these assumptions compute 
$a_{11}$ and $a_{21}$. If an encounter between any pair of individuals is equally 
likely and the probability of victory does not depend on the peck 
right relation existing before the encounter, that is, it is 1/2 for each 
individual, the calculation of the $a_{ij}$ is quite simple. Since the 
mutation $S_{21}$ can occur as a result of any encounter (cf. Figure 2(b)), 
provided the peck right relation is reversed, we have $a_{21} = 1/2$. On the 
other hand, $S_1$ is preserved in 5 out of 6 possible encounter outcomes, as can be 
seen from Figure 2. Hence $a_{11} = 5/6$, and 

$$S_1 = \frac{1/2}{1/6 + 1/2} = 3/4, \qquad S_2 = 1/4. \tag{18}$$

Note that this distribution is also the initial distribution for N = 3, as 
has been shown above to result from random initial victories. 
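
The elimination and Cramer's Rule step can be sketched in code for any small number of structures. The Python example below is an illustration under stated assumptions, not the author's procedure: the names `det` and `limiting_distribution` are hypothetical, and `a[i][j]` holds the probability of structure i mutating to structure j, so that each row of `a` sums to one.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def limiting_distribution(a):
    """Solve S_j = sum_i a[i][j] * S_i with sum S_i = 1 by Cramer's Rule."""
    n = len(a)
    if n == 1:
        return [Fraction(1)]
    last = n - 1
    # Row j of b holds the coefficients of S_1..S_{n-1} after eliminating S_n:
    # S_j - sum_i (a_ij - a_nj) S_i = a_nj.
    b = [[Fraction(int(i == j)) - (a[i][j] - a[last][j]) for i in range(last)]
         for j in range(last)]
    rhs = [a[last][j] for j in range(last)]
    d = det(b)
    s = []
    for k in range(last):
        # Replace the k-th column of b by the right-hand side vector.
        b_k = [row[:k] + [rhs[j]] + row[k + 1:] for j, row in enumerate(b)]
        s.append(det(b_k) / d)
    s.append(1 - sum(s))
    return s


# Three individuals: chain preserved with probability 5/6, cycle broken with probability 1/2.
a = [[Fraction(5, 6), Fraction(1, 6)],
     [Fraction(1, 2), Fraction(1, 2)]]
print(limiting_distribution(a))
```

Exact fractions are used so that the result can be compared directly with the values obtained for the three-individual society.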

The “mutation” method, however, enables us to introduce biases 
which may depend on inherent properties of individuals and on their 
social rank. Thus the probability of encounter between individuals of 
widely different social rank may be taken to be smaller than that 
between individuals of nearly equal rank. Likewise the probability 
of victory of a dominant individual may be taken to be greater than 
that of the submissive individual, etc. Hysteresis phenomena may 
likewise be introduced, that is, the dependence of victories on the 
number of past victories enjoyed by the individual, etc. 
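
As a hedged illustration of how such biases enter, the Python sketch below treats the three-individual society with two hypothetical parameters that do not appear in the paper: `q_top_bottom`, the probability that a given encounter in the chain is between the top and bottom individuals, and `p_dominant_wins`, the probability that the currently dominant individual of a pair wins. It assumes, as in Figure 2, that only a top-bottom reversal turns the chain into the cycle and that any reversal turns the cycle into a chain.

```python
from fractions import Fraction


def three_individual_limit(q_top_bottom, p_dominant_wins):
    """Limiting probabilities (chain, cycle) for N = 3 under the two assumed biases."""
    q = Fraction(q_top_bottom)
    p = Fraction(p_dominant_wins)
    if not (0 <= q <= 1 and 0 <= p < 1):
        raise ValueError("need 0 <= q <= 1 and 0 <= p < 1")
    reversal = 1 - p            # probability that the submissive individual wins
    a_12 = q * reversal         # chain -> cycle: top meets bottom and is beaten
    a_11 = 1 - a_12             # chain preserved otherwise
    a_21 = reversal             # cycle -> chain: any reversal breaks the cycle
    s_1 = a_21 / (1 - a_11 + a_21)
    return s_1, 1 - s_1


# Unbiased case: equal encounter chances (q = 1/3) and fair fights (p = 1/2).
print(three_individual_limit(Fraction(1, 3), Fraction(1, 2)))
# Rarer encounters between widely separated ranks (q = 1/5).
print(three_individual_limit(Fraction(1, 5), Fraction(1, 2)))
```

In this particular sketch the limiting probability of the chain works out to 1/(1 + q), so it rises above 3/4 as top-bottom encounters become rarer, while a uniform advantage for dominant individuals cancels out of the ratio. That cancellation is a property of these simplified assumptions and need not hold for larger societies.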


(2) ON THE MATHEMATICAL THEORY OF THE 
EQUILIBRIUM OF INTERACTIONS BETWEEN 
PROTEINS AND OTHER SUBSTANCES 


Enzo Boeri 


Istituto di Fisiologia, Universita di Napoli 


A PROTEIN P has n sites capable of binding another molecule A, the 
active concentration of which is x. The interaction between P 
and A proceeds along n steps of the type 

$$PA_{i-1} + A \rightleftharpoons PA_i$$

Each one of these steps is characterized by an equilibrium constant 
$K_i$. It has long been known (e.g. Adair 1925) that the 
saturation y of P with A varies between zero and unity according to a 
relation of the form 

$$y = \frac{\sum_{i=1}^{n} i\, x^i \prod_{j=1}^{i} K_j}{n\Bigl(1 + \sum_{i=1}^{n} x^i \prod_{j=1}^{i} K_j\Bigr)} \tag{1}$$

It is also known that for the case n = 1, the relation (1) is a 
rectangular hyperbola. Moreover, (1) reduces again to a rectangular 
hyperbola in cases in which n > 1 and the relative values of the 
successive equilibrium constants are determined solely by statistical factors. 
In those instances 

$$\frac{K_{i-1}}{K_i} = \frac{n - (i - 2)}{n - (i - 1)} \cdot \frac{i}{i - 1} \tag{2}$$

(see for instance Klotz 1946). In many instances n > 1 and the relation 
$K_{i-1}/K_i$ is more complex than in equation (2). This is the case when 
the successive binding of A modifies the reactivity of P in a way more 
complex than is to be expected from the gradual decrease of free spaces 
on P. Neighbouring bound A molecules may mutually interact (Pauling 
1935), electrostatic factors may be present (Klotz, Walker and Pivan 
1946), or possibly the affinities of the active groups on P may be 
thought of as having values distributed in such a way as to be adequately 
described by a Gauss error function (Pauling, Pressman and Grossberg 
1944, Karush and Sonenberg 1949). In these cases the plots of y versus 
x do not give hyperbolic, but sigmoid curves. 
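
The reduction to a rectangular hyperbola under purely statistical factors can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes relation (1) in the Adair form with the factor n in the denominator, and it introduces a single hypothetical intrinsic constant `k_intrinsic` (not named in the abstract) from which the successive constants are generated so that their ratios satisfy relation (2).

```python
def adair_saturation(x, ks):
    """Saturation y for concentration x and successive constants ks = [K_1, ..., K_n]."""
    n = len(ks)
    numerator = 0.0
    f = 1.0           # f = 1 + sum_i x^i K_1...K_i, so the denominator is n * f
    term = 1.0
    for i, k in enumerate(ks, start=1):
        term *= k * x             # running value of x^i * K_1 * ... * K_i
        numerator += i * term
        f += term
    return numerator / (n * f)


def statistical_constants(n, k_intrinsic):
    """Constants K_i = (n - i + 1) / i * k_intrinsic, whose ratios obey relation (2)."""
    return [(n - i + 1) / i * k_intrinsic for i in range(1, n + 1)]


n, k_intrinsic = 4, 2.0
ks = statistical_constants(n, k_intrinsic)
for x in (0.1, 0.5, 1.0, 5.0):
    hyperbola = k_intrinsic * x / (1 + k_intrinsic * x)
    print(x, adair_saturation(x, ks), hyperbola)
```

With these constants the product $K_1 \cdots K_i$ is a binomial coefficient times the i-th power of the intrinsic constant, so the denominator is a perfect n-th power and the two printed columns agree. Any other choice of constants will in general give a non-hyperbolic curve.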

The purpose of the present paper is to call attention to the analogy 
of the aforesaid problem with that of mono- and multilayer adsorption. 
Let now x represent the relative pressure of a gas, the ratio of 
actual pressure to the saturation pressure; n is the number of layers 
and y the saturation of the gas on the adsorbent. Then, according to 
the BET theory of multilayer adsorption (Brunauer, Emmett and 
Teller 1938) 

$$y = \frac{\sum_{i=1}^{n} i\, c\, x^i}{n\Bigl(1 + c \sum_{i=1}^{n} x^i\Bigr)} \tag{3}$$

where c is a constant. For n = 1 a plot of y versus x gives a rectangular 
hyperbola (Langmuir’s monolayer) and for n > 1 sigmoid curves. 
In both equations (1) and (3), if we let nf represent the denominator, 
we have 

$$y = \frac{x}{n f}\, \frac{df}{dx} \tag{4}$$

Equations (1) and (3) differ in that (3) has but one 
constant, whereas (1) has n constants (although all may be derived from 
$K_1$ by some relation). 
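
A small numerical check of the relation between y and f may be useful here. The Python sketch below is an illustration only: it assumes the BET relation (3) in the normalized form with denominator nf, where $f = 1 + c \sum x^i$, and compares the direct sum with $x\,(df/dx)/(nf)$ evaluated by a finite difference. The function names and the parameter values are hypothetical.

```python
def bet_f(x, c, n):
    """f(x) = 1 + c * sum_{i=1..n} x^i, so that n * f is the denominator of (3)."""
    return 1 + c * sum(x ** i for i in range(1, n + 1))


def bet_saturation(x, c, n):
    """Saturation y from the summed form of the n-layer BET relation."""
    numerator = c * sum(i * x ** i for i in range(1, n + 1))
    return numerator / (n * bet_f(x, c, n))


def saturation_from_f(x, c, n, h=1e-6):
    """Saturation from y = x * f'(x) / (n * f(x)), with f' by central difference."""
    dfdx = (bet_f(x + h, c, n) - bet_f(x - h, c, n)) / (2 * h)
    return x * dfdx / (n * bet_f(x, c, n))


c, n = 20.0, 3
for x in (0.05, 0.2, 0.4, 0.8):
    print(x, bet_saturation(x, c, n), saturation_from_f(x, c, n))

# For a single layer the expression reduces to the Langmuir hyperbola c*x / (1 + c*x).
print(bet_saturation(0.3, 5.0, 1), 5.0 * 0.3 / (1 + 5.0 * 0.3))
```

The two columns agree to within the finite-difference error, which is all the check is meant to show.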

Strangely enough, the interaction equilibrium between 
P (a protein) and A (another substance) is sometimes quite 
well represented for y < 0.7 by the inverse transformation of the BET 
equation (3) for n layers, 

$$x = \frac{x_0\, c\, y\, [1 - (n + 1) y^n + n y^{n+1}]}{(1 - y)\, [1 + (c - 1) y - c y^{n+1}]} \tag{5}$$

the constants $x_0$ and c being graphically determined according to the 
method of Wolmen and Andrews (1948). Also the inverse transformation 
of Pickett’s simplified equation (1945) may be used (see Boeri 
1947). It is surprising to see how often n turns out to be the right 
number, in agreement with the experiment. Examples are presented. 

Thus far, I have not been able to give sound logical reasons for this 
coincidence. 


(3) CONTRIBUTION TO THE FACTORIAL ANALYSIS OF AN 
INCUBATION PERIOD 


SCHWARTZ, JEAN CUZIN, AND ANDRÉ RENIER 
Tobacco Mosaic 
Abstract 


THE INCUBATION PERIOD for tobacco plants inoculated with the mosaic 
virus has been shown to vary with the strain of the virus, 
the genotype of the plant, and the environmental conditions. The authors 
studied certain factors that could affect the size and the growth rate 
of plants of a fixed variety of Nicotiana Tabacum L. before their 
inoculation with a homogeneous suspension of purified virus protein. They 
inoculated leaf 18 on each of 265 plants on the same day. Five 
days later, on average, the inoculated leaves were cut off, and 
at the end of the experiment 120 mosaic-diseased plants and 145 healthy plants were counted. 

A statistical study was made of the following factors: (1) state of the plant 
(= 0 for a healthy plant and 1 for a diseased plant), (2) length of leaf 18 
three weeks before inoculation, (3) length of leaf 18 on the day of 
inoculation, (4) number of leaves on the plant on the day of inoculation, 
(5) growth rate of leaf 18 during the week before inoculation, 
and (6) growth rate of the plant during the week before 
inoculation. 

The partial correlation coefficients of the dependent variable (1) 
with each of the others were $r_{12.3456} = -.27$, $r_{13.2456} =$ [value illegible], 
$r_{14.2356} = -.24$, $r_{15.2346} = -.09$, and $r_{16.2345} = .07$. The limit of 
significance for P = .05 was ±0.12. 

The results were interpreted by means of schematic growth curves 
of leaf 18 for a diseased plant and a healthy plant. Thus, 
for example, among plants whose leaves 18 were of equal length, or growing at 
equal rates, on the day of inoculation, leaf 18 was on 
average shorter three weeks earlier in the plants that became 
diseased, so that they were growing relatively faster before 
inoculation. One explanation would seem possible: a 
substance whose concentration is proportional to the age of the plant, but not 
to the stage of development of the leaf, might have diffused into the 
leaf and retarded the spread of the virus. 

A later experiment on the effect of illumination is mentioned 
briefly. Placing a leaf in darkness during the 
fortnight preceding inoculation shortens the incubation 
period considerably. 


(4) FREE HAND CURVES IN ESTIMATING THE POTENCY 
OF HUMAN SERA AGAINST TOXOPLASMA 


Ch. A. G. Nass 


Institute of Preventive Medicine, Leiden 
Abstract 


SEVENTY HUMAN SERA were subjected to a “Sabin-Test”, in which 
constant volumes of a standard culture of the blood parasite 
Toxoplasma were exposed to each of three doses (1:100, 1:20, and 1:4) of each 
serum. After staining, a slide count of the number of unstained 
parasites in 100 gave an estimate (y) of the percentage kill of Toxoplasmas by 
the given dose of serum. The potency of each serum was to be evaluated 
from its three y-values corresponding to the coded log-doses x − 1, x and 
x + 1. 

It was assumed provisionally that the dosage-response curves for the 
individual sera could be superimposed by shifting them along the log- 
dose axis, as if the sera were different dilutions of a single active agent in 
an inert carrier. With a single curve of the expected response (Y) plotted 
against the log-dose x it would be possible to estimate the content of 
agent in any given serum from its observed y-values. In the absence of 
analytical techniques, this assumed curve has been fitted visually. The 
method is not recommended as a labor-saver. 

The three responses, y(x − 1), y(x) and y(x + 1), for each of the 70 
sera were listed in order of increasing Σy and averaged in 14 sets of five. 
In the first figure the values of y(x − 1), y(x) and y(x + 1) were plotted 
against y(x) and fitted visually with three curves, Y(x − 1), Y(x) and 
Y(x + 1). In this first approximation the relative potency of a serum 
was estimated from the middle dose alone by setting y(x) = Y(x), 
rather than from the more stable estimate Σy = ΣY. In a later stage 
of fitting, each of the 14 sets was replotted against Y(x) at a level such 
that Σ(y − Y) = 0. From the original assumption that all sera were 
solutions of a single active agent, the diagram was necessarily 
symmetrical, so that at any point on the Y(x) diagonal the horizontal 
distance to the Y(x + 1) curve equalled its vertical distance to the Y(x − 1) 
curve. 

Two transformations of the abscissae of these curves aided in 
adjusting the free-hand diagrams to their final form, the ordinate remaining the 
same throughout. In the second figure the abscissa was changed to 
Y(x − 1) + Y(x) + Y(x + 1), so that sets could be located directly by 
the condition Σy = ΣY. While the curves were not symmetrical, 
rectangles drawn parallel to the axes had two opposite corners on the 
Y(x) curve and the other two corners on the Y(x − 1) and Y(x + 1) 
curves. Finally, with the aid of the first two graphs to determine the 
log-dose x of the active agent, the curves were replotted in a third chart 
against x with x = 0 at Y = 50. The Y(x − 1), Y(x) and Y(x + 1) 
curves then became sigmoid and identical except for position, with lower 
and upper asymptotes at Y = 0.4 and 82% respectively. An additional 
scale on the third figure facilitated reading x for a given serum directly 
from its observed value of Σy = ΣY. 

The three graphs emphasized different aspects of the data and thus 
complemented each other. In making the final adjustments, the three 
curves were held to the rules for transforming one graph to another and 
were so placed as to minimize the runs of points on one side of a curve. 
When fitting was completed, the number of runs was nearly exactly what 
would be expected in a random sequence, as tested statistically. 

Of the three degrees of freedom for each serum, one was lost in com- 
puting its potency. The variance components for the other two degrees 
of freedom have been defined and evaluated on the assumption of a 
binomial distribution of the observed y’s about their expected Y’s. One, 
the “slope variance”, measured the failure of the curves for the
individual sera to follow the general Y(x) curve. This was significantly larger
than the second or residual “error variance”. It is apparent that not all
sera could be represented by one curve or a single active agent.
DISCUSSION ON CONTRIBUTED PAPERS 


(1) OUTLINE OF A MATHEMATICAL THEORY 
OF PECK RIGHT 


A. Rapoport 


M. Schutzenberger. The paper of Rapoport shows us the broader scope of biom- 
etry. It also shows that there is some basic difference between statistical biology and 
mathematical biology even if we are sure that the laws of a rational science of biology 
will be those of a mathematical biology. This leads to two points. 

The first is the importance of observation and experiment in the selection of 
initial assumptions. This is especially true in fields more complicated than the exam- 
ple chosen by Rapoport, that of human relations for example. The actual situation, 
when analyzed with adequate statistical tools, will offer us assumptions that we never 
would have thought of, a priori. 

Secondly, we feel that the real need of mathematical biology is not the mechan- 
ical application of the sort of mathematics which was so successful in physics. New 
mathematical tools must be acquired, especially devised for biology. We wish to 
congratulate Dr. Rapoport for doing this by showing how simple combinatorial 
analysis can be used in biological situations. Although the statistical and the mathe- 
matical schools have different points of departure, it is to be hoped that we will meet 
in the construction of rational models which can be used to describe the structure of 
social groups, models which possibly can be used to modify such group structure. 
If we take care always to formulate our models so they are capable of experimental 
disproof, how can we fail to build a rational biology? 


J. B. S. Haldane. Mr. Rapoport will find a discussion of an analogous problem
in the last chapter of the first volume of Kendall's “The Advanced Theory of
Statistics”, which deals with the logical consistency of a scale of preferences.


(2) ON A MATHEMATICAL THEORY OF THE EQUILIBRIUM 
OF THE INTERACTIONS BETWEEN PROTEINS AND 
OTHER SUBSTANCES 


E. Boeri 


Leopold Martin. Dr. Boeri's communication is very interesting. Indeed,
although an experimental biochemist, Dr. Boeri tends to study more closely
the inner workings of living matter from the physico-chemical point of view.
The Langmuir adsorption isotherm, considered here among others, has moreover
been widely used by Hinshelwood in his book “Chemical Kinetics of the Bacterial
Cell” and, in the particular case of the adaptation of B. pyocyaneus to
streptomycin, by R. Linz and L. Martin (C. R. Soc. Biol., March 1949).
NEWS AND NOTES 
Chicago Meeting 


At the Annual Meeting of the American Statistical Association, 
December 27-29, 1950, in Chicago in the Congress Hotel, The Biometric 
Society (ENAR) and The Biometrics Section of the American Statistical 
Association will hold jointly the following sessions: 


December 27, 10-12 A. M. 
Topic: Statistical problems in radio-biology. Chairman: A. E. Brandt. 
Papers: (1) Gene mutations in populations, Bruce Wallace. (2) Some 
tracer chemistry experiments with proteins, S. Lee Crump. (3) 
Metabolism of labeled carbon compounds, Hardin B. Jones. 


December 27, 2-4 P. M. 
Topic: Theory of variance components. Chairman: W. J. Youden. 
Papers: (1) The present status of variance component analysis, S. Lee 
Crump. (2) Testing a linear relation among variances, W. G. Cochran. 
(3) Application to regression and to errors of measurement, John Tukey. 


December 27, 4-6 P. M. 
Topic: Measurement of Morbidity. Chairman: Harold Dorn. 
Papers: (1) Is the household survey essential in securing morbidity 
statistics? S. D. Collins and T. D. Woolsey. (2) Prepaid medical care 
as a source of morbidity data, Neva R. Deardorff. (3) Experience of 
associated hospital service, Allen Thompson. 


December 27, 4-6 P. M. 
Topic: Precision of measurements. Chairman: W. Edwards Deming. 
Papers: (1) The specification of precision of measurements, Churchill
Eisenhart. (2) The estimation of precision of measurements, Frank E.
Grubbs. (3) Estimate of precision of textile instruments, John C.
Whitwell.


December 27, 8-10 P. M. 
Party 


December 28, 9-10 A. M. 
Biometrics Section business meeting. 
December 28, 10-12 A. M. 
Topic: Statistical methods in pharmacology and immunology. Chair- 
man: Lloyd Miller. 
Papers: (1) Collaborative bio-assays, Lila Knudsen. (2) Statistical 
methods in immunology, Herbert C. Batson. 


December 28, 2-4 P. M. 
Topic: Applications of variance components. Chairman: G. W. 
Snedecor. 
Papers: (1) Variance components as a tool for the analysis of sample 
data, Walter A. Hendricks. (2) Consistency of estimates of variance 
components, R. E. Comstock and H. F. Robinson. (3) Use of com- 
ponents of variance in preparing schedules for the sampling of baled 
wool, J. M. Cameron. 


December 28, 4-6 P. M. 
Topic: Sample survey techniques. Chairman: W. F. Callander. 
Papers: (1) A consumer survey, Arnold J. King. (2) Approaches to 
agricultural price statistics, F. E. McVay and Henry Tucker. (3) 
Problems in rural surveys, R. L. Anderson and A. L. Finkner. 


December 29, 9-10 A. M. 
The Biometric Society business meeting. 


December 29, 10-12 A. M. 
Topic: Statistical methods in medicine. Chairman: Hugo Muench. 
Papers: (1) Survival curves for special diseases, Joseph Berkson. 
(2) Some designs used in clinico-physiological experiments, Donald 
Mainland. (3) Multivariate analysis in medical research, James A. 
Rafferty. 


December 29, 2-4 P. M. 
Contributed papers. 


December 29, 4-6 P. M. 
Contributed papers. 


Cleveland Meeting 


A meeting of The Biometric Society, Eastern North American Region, 
will be held on December 27, 28 and 29 jointly with the meetings of the 
A.A.A.S. in Cleveland. Several sessions will be devoted to a symposium 
on mathematical biology and biometry with a number of invited speakers 
participating. 
Members desiring to present 15 minute papers at the meeting are 
requested to send in the titles of the papers, together with a 200-word 
abstract, to N. Rashevsky, Committee on Mathematical Biology, The 
University of Chicago, 5741 Drexel Avenue, Chicago 37, Illinois. 


Heterosis Conference 


A Heterosis Conference was held June 13 to July 13, 1950, at Iowa 
State College. The purpose of the conference was to summarize the 
available information on methods by which the heterotic effects may be 
attained, the known causes which are responsible for these effects and 
the problems which require further research for their solution. 

As a supplement to the Heterosis Conference, a Methods Workshop 
was held from July 3 to 14. Statistical methodology for analysis of data 
from breeding experiments and statistical aspects in the design of such 
experiments were presented and discussed. John Gowen acted as chair- 
man of the Heterosis Conference. Jay L. Lush, Iowa State College, and 
R. E. Comstock, Institute of Statistics, were joint leaders of the Work- 
shop. 


Thomas Parran, The University of Pittsburgh, Graduate School of
Public Health, has called to our attention the first bulletin of this
Graduate School. The teaching and research activities of the Department
of Biostatistics are aimed primarily at the development of methods
for the statistical appraisal of the health problems of groups: the
community, the family, and special aggregates such as the population in
industry and in school. Mr. Parran writes, “The curriculum has been
devised to provide a progressive demonstration of the means by which 
statistical reasoning applied to the several medical and biological sci- 
ences can help to solve health problems of groups through the determina- 
tion of health needs, evaluation of efforts to meet these needs, and the 
measurement of the influence of social and biological factors on health 
and disease in man.” . . . Walter D. Foster, Biometrician, Department of 
Biochemistry, West Virginia University, received his Ph.D. degree in 
experimental statistics at the Institute of Statistics, Raleigh, and is now 
serving as statistician for the Nutritional Status Project in the North- 
eastern Region. . . . David J. Finney, University of Oxford, England, 
was married to Mary Elizabeth Connolly on April 11. . . . C. I. Bliss is 
recovering from a heart attack. . . . Marianne Bernstein is in Oslo, 
Norway. She writes, “After having been in five different European 
countries, I must say that Norway is the most austerity ridden country 
in Europe. So far I have been most impressed by the climate in Italy, 
especially in Rome. During the six weeks I was there it only rained 
three days. I was able to do a lot of sightseeing. Sometimes my father 
came along explaining; he had studied art in Italy. I indulged in Roman 
cooking. What a delight to be able to eat twenty different vegetables, 
numerous kinds of fruits and unrationed meat and butter. I was most 
impressed in Rome by C. Gini’s home. Several rooms are used as 
studies, and the walls are covered with book shelves.” . . . Ralph Bradley 
from McGill University, Montreal, has joined the staff as associate 
professor in the Department of Statistics, Virginia Polytechnic Institute. 
He is doing research on rank order statistics. Mr. Bradley received his 
B.A. and M.A. degrees in mathematics from Queen’s University and his 
Ph.D. degree in mathematical statistics at the University of North 
Carolina. . . . David Duncan, senior lecturer in statistical methods at the 
University of Sydney, Australia, will join the statistical staff of Virginia 
Polytechnic Institute on September 1. Mr. Duncan holds a B.Sc. degree 
in Agriculture, a B.S. degree in mathematics from the University of 
Sydney and a Ph.D. degree in mathematical statistics from Iowa State 
College. . . . William Feller, formerly with the Department of Mathe- 
matics, Cornell University, Ithaca, has been appointed Eugene Higgins 
Professor of Mathematics at Princeton University. . . . Donald Mainland, 
Professor of Anatomy, Dalhousie University, Halifax, Nova Scotia, 
Canada, since August 1, is Professor of Biostatistics, Department of 
Preventive Medicine, New York University—Bellevue Medical Center. 
. . . Paul L. Munson, Research Associate, Department of Pharmacology, 
Yale School of Medicine, has been appointed Assistant Professor of 
Dental Science, Harvard School of Dental Medicine. . . . J. E. Morton
is on leave from Cornell to serve as Chief, Statistical Research and
Development Staff, Division of Housing Research, Office of the
Administrator, Housing and Home Finance Agency, Washington, D. C. . . .
Allan E. Paull, who has been at the Grain Research Laboratory,
Winnipeg, Manitoba, Canada, is now with Abitibi Power & Paper Company,
Toronto. . . . H. Fairfield Smith, formerly Statistician with the Rubber
Research Institute of Malaya, is now a Professor with the Institute of
Statistics, Raleigh, North Carolina.